
PHILOSOPHY, SCIENCE, AND METHOD

Essays in Honor of

Ernest Nagel

Editors: SIDNEY MORGENBESSER
Columbia University

PATRICK SUPPES
Stanford University

MORTON WHITE
Harvard University

St. Martin’s Press, New York


COPYRIGHT © 1969 BY ST. MARTIN’S PRESS, INC.

ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY
FORM WITHOUT PERMISSION FROM THE PUBLISHER.

LIBRARY OF CONGRESS CATALOG CARD NUMBER 69-17855

DESIGNED BY CHARLES KAPLAN

MANUFACTURED IN THE UNITED STATES OF AMERICA


PREFACE

Ernest Nagel is one of this generation’s most distinguished philosophers
of science and one of its most effective spokesmen for critical naturalism.
His philosophy of science and his commitment to naturalism are intimately
connected, for he believes that the methods and results of science vindicate
the spirit of classical naturalism, and that modern naturalism is intolerably
thin when it is not informed by considerable familiarity with those methods
and results. Nagel therefore insists that theorists of knowledge should carefully
examine the logic of science; and because he doubts the value of
detached metaphysical speculation, he holds that philosophers must acquaint
themselves with what scientists say and do in order to analyze such concepts
as space, time, and causality. In a similar vein he holds that the conclusions
and self-corrective methods of science lend support to social and political
liberalism. Ernest Nagel is, then, a remarkably wide-ranging analytical
philosopher, who has combined logical power and scientific learning in his
admirable effort to formulate principles that should guide rational inquiry
and rational living.
Nagel has conducted this effort by using more than the written word.
He has been a most influential teacher and an ideal intellectual companion
to many philosophers, scientists, and men of affairs; he has not only exhibited
but also rekindled in others commitments to clarity and concern
for truth and for the need to integrate vision and technique. We hope that
the essays here collected, all written since 1966 and some finished that year,
will bring pleasure to the man they are intended to honor. All of them are
concerned with issues that have been of interest to Ernest Nagel—the
nature of the scientific enterprise, the conditions for warranted belief and
justified action, the tenability of pragmatic theses about logic and scientific
theories, and the scope and limitations of scientific approaches to discussions
of human affairs and human action. It is also heartening to note that
many of the essays have been contributed by historians, scientists, linguists,
and legal theorists who, by exhibiting the philosophical relevance of issues
in their field, confirm Ernest Nagel’s conviction that philosophers may learn
much from the work of scientists and scholars. Although there is wide
agreement among the contributors concerning the method and style of
philosophy, there is no orthodoxy and there is occasional dissent from some
views that Ernest Nagel has defended. We are sure he would not want
it any other way.

CONTENTS

Preface

1. SCIENCE AND INQUIRY

Nagel’s Lectures on Dewey’s Logic, PATRICK SUPPES
Some Difficulties in Knowing, STUART HAMPSHIRE
On Objective Intensions and the Law of Inverse Variation, R. M. MARTIN
Confirmation of Laws, MARY HESSE
Induction and the Aims of Inquiry, ISAAC LEVI
Concepts of Statistical Evidence, ALLAN BIRNBAUM
Some Half-Baked Thoughts about Induction, MAX BLACK

2. STRUCTURE OF SCIENCE

On What There May Be in the World, G. FEINBERG
On Cartesian and Darwinian Aspects of Biology, THEODOSIUS DOBZHANSKY
Reduction: Ontological and Linguistic Facets, CARL G. HEMPEL
The Realist-Instrumentalist Controversy, SIDNEY MORGENBESSER
The Identity Thesis, JUDITH JARVIS THOMSON
Extensive Measurement When Concatenation Is Restricted and Maximal Elements May Exist, R. DUNCAN LUCE and A. A. J. MARLEY
Causation and Action, MORTON WHITE
Some Empirical Assumptions in Modern Philosophy of Language, NOAM CHOMSKY
If Matter Could Talk, FRITZ MACHLUP
Functionalism in Social Anthropology, MAURICE MANDELBAUM
Legal Theory: Law, Justice, Ethics, and Social Morality, WOLFGANG FRIEDMANN

3. THE CONSTRUCTION OF THE GOOD

Metaphors, Analogies, Models, and All That, in Ethical Theory, ABRAHAM EDEL
Absolutism and Human Rights, SIDNEY HOOK
Justice and Rationality, CHARLES FRANKEL
Beyond Pareto Optimality, JAMES S. COLEMAN
Coercion, ROBERT NOZICK
Existentialism and Death: A Survey of Some Confusions and Absurdities, PAUL EDWARDS

4. HISTORICAL STUDIES

The Quadrature by Lunes in the Later Middle Ages, MARSHALL CLAGETT
Isaac Newton’s Principia, the Scriptures, and the Divine Providence, I. BERNARD COHEN
The Law of Inertia: Some Remarks on Its Structure and Significance, ARNOLD KOSLOW
Kant’s Philosophy of Arithmetic, CHARLES PARSONS
A Soviet Philosopher’s View of Peirce’s Pragmatism, PHILIP P. WIENER
John D. Rockefeller and the Historians, SIGMUND DIAMOND

1. Science and Inquiry

NAGEL’S LECTURES ON DEWEY’S LOGIC
Patrick Suppes

I first encountered Ernest Nagel in the late winter of 1947 when I
entered Columbia as a graduate student. The lectures I attended were
those of the course he entitled Types of Logical Theory, and during that
term he devoted the lectures to Bradley and Dewey. I listened eagerly and
with pleasure to everything that Nagel had to say about Bradley and
Dewey, and I marveled at the patient way he went about dissecting their
views and stressing the weak points in their arguments. Nagel’s lectures
on Dewey have never been published, and because I increasingly see the
importance of what Dewey was trying to do, it seemed more than appropriate
to give here an account of Nagel’s lectures. I have shown the
date of each lecture as a method of indicating natural breaks in the
narrative.

MARCH 10

Nagel finished his lectures on Bradley by making the well-known point
that Bradley seems to deny the essential validity of discursive discourse,
and he opened his lectures on Dewey by pointing out that this was an aspect
of Bradley’s thought that Dewey had absorbed and accepted. Before turning
to any details of the logic itself, Nagel made some general remarks
about Dewey’s motivation and general orientation. He noted four important
influences on Dewey’s thought that should be taken account of in examining
the Logic. The first was the importance that Dewey attached to
modern mathematics and experimental science; the second was the influence
of biological conceptions, particularly those of Darwin; the third
was the rejection of psychological empiricism; and the fourth was concern
with the social import of thought.
Concerning the first point about the importance of modern mathematics
and experimental science, Nagel said that Dewey inveighs against the view
that science gives us a final grasp of things. Dewey’s arguments are
directed against Aristotle and the tradition of British psychological
empiricism, which holds that sense gives us a final grasp of things. In
this connection as in others, however, one of the difficulties of Dewey’s
analysis is that he is usually directing a specific polemic at various unnamed
schools. His two main foes are traditional rationalists and traditional
empiricists, because he wants to eliminate the sharp dichotomy between
reason and sense. According to Nagel, Dewey’s attack on classical rationalism
has been fortified by the fact that many of the alleged first principles
of various sciences have subsequently been replaced by other first principles,
as, for example, in geometry.
Concerning the influence of biological conceptions in Dewey’s thought,
Nagel remarked that it is evident that Dewey emphasizes a genetic approach
in most of his works. Dewey, Nagel said, has been reproached for
confusing genesis with validity, and yet much can be said for Dewey’s
approach as a fruitful method. Dewey is dependent on Darwin for his
emphasis on change and, in particular, for the thesis that structure is not
fixed but functional. Dewey’s conception of the mind as active makes up
for some of the deficiencies apparent in associationistic psychologists like
Mill.
Concerning the social context of Dewey’s thought, Nagel remarked that
it is worth remembering that Dewey wrote his Logic to help the social
sciences progress at the same pace as the natural sciences. Dewey’s interest
in logic, Nagel said, has been controlled by the apparent profound chaos
in moral and social thought for which Dewey has sought a solution. One
of Dewey’s earliest and most influential papers (1902), “Logical conditions
of a scientific morality,” contains the kernel of most of his logical thought.
Dewey’s general conception is that an appropriate logic should be an
organon for the solution of pressing social problems.
After these preliminary and general remarks, Nagel turned to the
direct consideration of Dewey’s Logic, The Theory of Inquiry. To begin
with, Nagel pointed out, Dewey considers logic as the theory of inquiry and
not as a formal science. Dewey distinguishes between the proximate and
ultimate subject matter of logic. The proximate subject matter is the traditional
study of the implicative relations between propositions, and so forth.
There is much agreement on this aspect of the subject matter; but there
is disagreement over the ultimate subject matter, for example, about the
basic units of logic. According to Bradley, the basic units are judgments;
others hold that they are terms. For Dewey, the basic unit is a completed
act of inference. Dewey continually argues that inference is needed, because
any natural immediate experience is incomplete in itself, and its relation to
other events involves inference. Dewey’s chief criticism of ancient science
is that it mistook immediate qualities as the efficient causes of things. The
chief virtue of modern science is in overcoming this and being able to ignore
directly experienced qualities as causes. On this view logical theory is
regarded as systematic reflection on inquiry. In Dewey’s opinion, and I
think also in Nagel’s, too often there has been no systematic investigation
of why some inquiries are successful and others are not; the peculiar conditions
that produce success are too seldom understood. These conditions
are precisely the subject matter of logic. For Dewey, if there are no problem
solvers, there is no logic. Put another way, it may be said that for Dewey
logic is equivalent to experimental epistemology. Logic is a positive inquiry
into inquiries, having the methods of inquiries as its specific subject matter.
In this sense logic is descriptive and not normative, but it is also normative
insofar as it sets standards for later inquiries.

MARCH 12

Nagel began by pointing out that a central feature of Dewey’s logical
theory is his claim that logical forms arise inside inquiry and do not characterize
things outside of inquiry. Nagel pointed out that while this contextual
concept of logical form is a difficult one for those with a background
in mathematics and formal logic, it is worthwhile to see what
Dewey has in mind. According to Nagel, the key to understanding what
Dewey means here is to realize that the concept of logical form is not the
same in Dewey’s theory and in formal logic. Dewey holds that certain types
of functions are to be regarded as the logical forms; for example, the
functional relation between evidence and conclusion is not characteristic
of things themselves but only in relation to specific inquiries. Those who
argue that logical forms have the eternity of Platonic forms hardly understand
Dewey’s point. Another way of putting it, Nagel said, is that Dewey’s
logical forms are relations between means and consequences.
Nagel said that Dewey’s logical forms are postulates for inquiry. Dewey
is essentially saying that the distinctions of the proximate subject matter
belong to things only in specific inquiries. In these terms, logic becomes the
formulation of the conditions of successful inquiry.
Nagel remarked that a functional concept of knowledge is sometimes
attributed to Dewey; but unfortunately, the term functional is vague and
needs specification. Nagel said that there are two ways of specifying knowledge.
One way is to say that we have knowledge when we have the truth,
and to define knowledge in terms of its systematic characteristics, not in
terms of how it is acquired. The other way of specifying knowledge is to
say that knowledge is acquired through inquiry and that an essential way of
characterizing knowledge in this sense is to characterize the context of
inquiry. According to the latter view, which is Dewey’s, inquiry begins or
arises from doubt when there is a felt tension. What resolves the doubt or
tension is knowledge, the resolution of the problem. Nagel remarked that
Dewey’s theory of knowledge is functional in two ways. It is functional
insofar as knowledge is construed in terms of the resolution of particular
problems; it is also functional in the sense that knowledge is identified in
terms of a process (the process of inquiry) which can itself be overtly
located by reference to the behavior of certain organisms. For Dewey,
knowledge simply becomes the terminus of inquiry, and this view rules out
the question of the possibility of knowledge in general, or knowledge apart
from a particular context of inquiry.
For Dewey, inquiry is a transformation of indeterminate constituents into
a unified, determinate whole. Here Nagel quoted Russell’s famous remark
that this characterization would apply to a drill sergeant working with a
group of recruits. Nagel pointed out that one difficulty in Dewey’s theory is
that of determining to what extent the definition of logic proposed by Dewey
is adequate to the traditional problems of philosophy. Dewey seems to
intend his definition as an empirical generalization, and he speaks in such
broad terms that much of the revolutionary aspect of his thought evaporates
when what he says is understood.

MARCH 17

Nagel began by considering Dewey’s conception of the naturalistic character
of logic. Nagel characterized a theory as naturalistic when a continuity
is established between it and the biological operations out of which
it grew; but Nagel then asked what this emphasis on continuity contributes
to a logic interested in forms of warranted assertibility. What importance
does continuity have in this context? What is the point of raising it?
Dewey’s answer has already been stated; namely, hypotheses cannot be
introduced independent of a context, and it is precisely the introduction
of a context that leads us to the naturalistic analysis of inquiry.
Nagel then raised the particular question of the use of symbols as a
distinctive case. Nagel stated that he failed to see why Dewey’s emphasis
on continuity would change or obscure the usual or traditional account of
the role of symbols in inquiry. Continuity with biological operations, Nagel
stated, seems irrelevant in terms of the specific functions of symbols in
inquiry. Nagel also emphasized that Dewey’s attempt to show the continuity
between the logical and biological must be regarded as speculative.
The present amount of accurate empirical information is insufficient to
establish the relation. As an example, Nagel discussed ponendo ponens.
Dewey, he said, takes over Peirce’s view of rules of inference as leading
principles; for example, ponendo ponens is a habit men have acquired
which enables them to go from two premises to a conclusion. It is an efficient
habit that is a generalization from experience; but Nagel argued that
whatever biological foundations may underlie the genesis of the principle of
ponendo ponens, its validity may be established independently of biological
and physical interpretations.
As a second example, Nagel considered Dewey’s interpretation of
Aristotelian logic. Nagel said Dewey was right in asserting that the
syllogistic forms are not fruitful, but wrong in denying the validity of the
syllogism simply because it was developed at an early stage of science.
Nagel then stated that surely there must be some misunderstanding here.
Possibly what he was saying was an incorrect interpretation of Dewey.
Nagel said he was calling attention to the close connection between
genesis and validity for Dewey, and he raised the question of whether a
person who ignored the biological evidence would be a supernaturalist.
In Nagel’s view, surely not.
As a final example, Nagel remarked on Dewey’s treatment of the principle
of identity. For Dewey, it is not just the form if p then p, but rather
the carrying through of the term so that it has one meaning. It formulates
a rule for the handling of terms. Nagel said that for Dewey it then becomes
a synthetic principle rather than a tautology.

MARCH 19

Nagel began with some remarks about symbols which, I believe, were
meant to reflect Dewey’s views, although my notes are vague on this point.
Nagel said that symbols are artificial signs. Between natural signs and their
object, there is signification. Between artificial signs or symbols and what
they stand for, there is meaning. We can have clusters of symbols or artificial
signs, but not clusters of natural signs. For Dewey, all inquiry
essentially involves the use of symbols or artificial signs. This is a point
which much of the criticism of Dewey’s logic seems to have missed.
Nagel then considered the important term situation in Chapter IV of the
Logic. For Dewey it is not possible to define the term situation, for all
definitions are made within situations. There are two points of Dewey’s
discussion that Nagel said were important to note. The first is that perception
itself occurs always within a situation; the second is the sense in which
perception is cognitive. The central point here is that Dewey denies that
perception per se is cognitive. Isolated acts of perception are for Dewey
not knowledge; they are neither true nor false. Validity or invalidity is
relevant only if we consider the signification of the perceptions. Throughout
his many discussions, Dewey criticizes the doctrine of immediate knowledge,
that is, the doctrine that direct or immediate knowledge arises from
sensation. Nagel repeated that it is essential to realize that in Dewey’s
logic knowledge can never be the case of simple perception alone. The
reason for Dewey’s claim is clear. Perception is not the outcome of inquiry,
but this is the most important characteristic of knowledge; consequently,
perception is not knowledge.
Relevant to this discussion, Nagel turned to Dewey’s important distinction
between having and knowing. For Dewey, Nagel said, knowing is the
terminus of inquiry, but having is essentially an aesthetic experience.
Knowledge is capable of being formulated in discourse, but that which is
had is not. Here we can see how perception fits into Dewey’s scheme of
things. Perception is having. Dewey’s criticism of philosophical idealism is
that it does not admit the distinction between having and knowing, and thus,
does not permit us to break through the egocentric circle. For Dewey,
Nagel pointed out, knowing is an instrument of having, but Nagel remarked
that it is perhaps impossible to define clearly what having is for Dewey,
and it is often hard to distinguish between having and knowing. In fact,
Nagel said, he had a certain difficulty in understanding what knowledge
is for Dewey. Even if it is agreed that knowledge is the terminus
of a situation or an inquiry, it is not clear whether this terminus is a
having or a knowing. It seems to be both, but in different respects for
Dewey.
At this point Nagel digressed in order to consider Dewey’s distinction
between scientific and common-sense knowledge. Scientific knowledge has
no reference to the immediate situation. Common sense does, but it is also
vague. Common sense is interested in qualitative differences for ends of use
and for enjoyment. Science is nonqualitative. Science is interested in nonqualitative
differences for purposes of knowledge. On the other hand,
science originates in common sense. In Dewey’s view, the Greeks’ separation
of art from science and their belief in pure reason slowed scientific development
tremendously. (Here I suspect that Dewey, like other philosophers,
has been seduced by the traditional Platonic tale. It is scarcely possible to
claim that the deep and important development of mathematical and
observational astronomy by the Greeks represents a theoretical or practical
belief in pure reason as a method of learning about the world. On this point,
Dewey, and perhaps to some extent Nagel, is simply repeating a common
view of Greek science.) For Dewey, the difference between science and
common sense is really social and not logical. It is more or less accidental
that thus far science has tended to concentrate on different problems from
those of common sense. In Dewey’s view, prescientific ideas have held
sway too often and for too long in morals and politics. This has made
for an essential split in common sense that is reflected in philosophy—the
split between pre- and postscientific thinking. Nagel concluded this digression
by pointing out that Dewey’s central point is to emphasize the fundamental
unity of the two kinds of inquiry and not to sanction any absolute
difference between them.
Nagel concluded this lecture with a brief discussion of Chapter V of the
Logic, which deals with the needed reform of logic. Nagel mainly reviewed
Dewey’s claims as to why a reform is needed, that is, why classical logic
no longer applies or is appropriate. First, classical logic is qualitative only.
Second, Greek science, the context in which classical logic arose, asserted
heterogeneity of substance and motion, whereas ours asserts homogeneity
of the two. Third, in Greek science all quantifications were accidental rather
than essential. Fourth, relations were in general also accidental, and now
they are the prime subject matter of science. Dewey wants modern logic
to serve present science and culture in the way that Aristotle served the
science and culture of his own time. (It is not clear to me to what extent
Dewey realized that Aristotle’s logic was in no sense adequate to Greek
mathematics or astronomy. In any case, the thrust of this chapter seems
to be one of the weaker parts of the Logic, and, like Nagel, we can
quickly dispense with it here.)

MARCH 24

In this lecture Nagel turned to Part II of the Logic, which deals with
the structure of inquiry and the construction of judgments. Nagel emphasized
that we cannot hope to get the sense of what Dewey is saying by
examining minutely any specific argument. Nagel said that Dewey’s definition
of inquiry illustrates this point perfectly. Russell’s criticism of Dewey
is well justified if we take strictly the single italicized statement of Dewey
in which he defines inquiry (pp. 105-106): “Inquiry is the controlled or
directed transformation of an indeterminate situation into one that is so
determinate in its constituent distinctions and relations as to convert the
elements of the original situation into a unified whole.” Nagel’s point is that
it would never do to take seriously for careful analysis this single italicized
definition of inquiry. What we have to do is to read the many passages
in which Dewey discusses inquiry and put together for ourselves a more or
less inductively constructed picture of what Dewey means by inquiry.
Nagel then said he would try to give some feeling for how this might be
done. He began by saying that the situation with which inquiry begins is
indeterminate. The constituents of the situation do not hang together. The
organism wants to know in what way they do not hang together, for surely
in other ways, they do. Nagel said that apparently the problem is raised in
the mind of the inquirer, because he cannot understand the structures of
the various parts of the situation. In this sense, the situation is doubtful
or indeterminate. It is important to emphasize that for Dewey this sense
is not merely a subjective one. Complete determination does not hold for
the environment. Dewey says that evidence in the physical sciences about
indeterminacy of physical events is evidence that the sense is not purely
subjective. Nagel remarked that this argument is questionable, but that the
reasons were too complicated to discuss at this point. Here Nagel mentioned
Dewey’s lecture on time two years before (1945). Nagel’s version of the
lecture was that Dewey held that time and individuality are connected.
Individuality is development in time, and Dewey claimed that to some
extent his view was derived from statistical mechanics. Nagel asserted that
using such evidence from physics to discuss human individuality was, in his
view, bad philosophy. He said he would agree that the cause for doubt can
be outside the body of the doubter, but this does not make the situation
itself doubtful. Nagel asserted that, as far as he could see, the element of
indeterminacy exists wholly, or almost wholly, in the observer. (The
sophisticated thing which Dewey might have attempted is a biological
interpretation of subjective probability. It seems to me that this is the best
line along which to claim that indeterminacy is more than subjective, or
at least to take the sting out of the claim that it is subjective.)

MARCH 26

The discussion of Dewey’s conception of inquiry was continued. Nagel
said that he would assume that what Dewey meant by an indeterminate
situation was not enough. Nagel went on to the second main point, namely,
that the existence of the indeterminate situation is not enough; the further
step of formulating the problem must be taken. Formulation of a problem
involves for Dewey partial transformation of the situation itself. Nagel
emphasized that Dewey seems to use the concept of transformation to mean
an actual physical change in the situation.
Nagel then examined more carefully Dewey’s notion of a problem. He
first emphasized that for Dewey a problem is something that must have
a possible solution. For Dewey the statement of a problematic or indeterminate
situation as a problem has meaning only if in the very terms of statement
there is a possibility of solution. Nagel said that this may be Dewey’s
version of the verifiability theory of meaning. At this point Nagel reviewed
briefly some standard versions of the verifiability theory. Nagel felt that,
in spite of Dewey’s vagueness, his approach had a certain body and
fullness on these matters that the positivistic version of the verifiability
theory of meaning lacked.
Nagel then commented on a number of related points. For Dewey, he
noted, factual conditions determine probable conditions. There need to
be settled facts or constituents in a situation. Nagel said that Dewey’s
affirmation of the need for settled facts raises the previous question as to
what is meant by doubt in the situation as a whole. Nagel asked what
Dewey regarded as the facts. The answer seems to be, he said, that the
facts are constituents in the situation that can be noted through perception.
But he commented that later Dewey says that a subject (in relation to a
predicate) always consists of those perceptual elements that help to identify
the problem. Nagel remarked that this leads to an ideational element. For
Dewey, ideas are anticipated outcomes of possible solutions. The terms
of a problem, or the boundary conditions (to use a terminology not used by
Dewey), are facts. Ideas, in contrast, are possible solutions of problems.
Nagel said that this “in a nutshell” is Dewey’s theory of judgment, which
he would discuss in greater detail later. To support this analysis from the
text, Nagel quoted the following passage from page 111 of the Logic.

In logical fact, perceptual and conceptual materials are instituted in functional
correlativity with each other, in such a manner that the former locates and
describes the problem while the latter represents a possible method of solution.
Both are determinations in and by inquiry of the original problematic situation
whose pervasive quality controls their institution and their contents. Both are
finally checked by their capacity to work together to introduce a resolved unified
situation. As distinctions they represent logical divisions of labor.

Nagel noted that the distinction between perceptual and conceptual materials
made in this passage is perhaps the basic distinction for Dewey. The fundamental
thing for Dewey is that there is no dualism of perception and
conception but rather a functional differentiation. Dewey’s theory of inquiry
is an attempt to show how two different processes that are widely separated
in traditional philosophy and epistemological analysis are but two different
functions within a single framework of inquiry.
Nagel reviewed the steps in inquiry as Dewey conceived them. We begin
with an indeterminate situation. We transform that situation into the fixing
of a specific or definite problem. The next step is the introduction of
reasoning or symbolic operations in order to determine the acceptability
of the possible solutions. What is important about Dewey’s conception of
reasoning, which we shall examine in more detail later, is that it is always
an intermediate operation. A mathematical investigation or, it would seem,
any purely theoretical work is always a phase of inquiry and cannot be
the whole of inquiry. The reason for this is that a purely theoretical study
does not constitute the formulation of a problem in terms of observation,
nor apparently in terms of the explicit and immediate feeling of indeterminacy
of an actual physical situation. Nagel emphasized that in making
these observations he had been supposing that the conception of factual
material does not include the observation of symbols as carriers of meaning.
It is apparent that from a formal standpoint this would be a way of conceptualizing
a theoretical investigation as a complete inquiry, and in my
own judgment, it would be very much in the spirit of Dewey’s general
philosophy to give such a radically empirical account of pure mathematics.
However, it should be emphasized that Nagel did not support this view,
nor is it easy to support it by any direct remarks on Dewey’s part.
Nagel concluded the lecture by noting that it seems appropriate to assign
mathematics to the intermediate operation of reasoning. He noted that
Dewey’s use of the word problem is more restricted than the ordinary use
and does not include such things as mathematical problems. In this discussion
of mathematics, Nagel referred to page 405 of the Logic. Several of
the passages there do suggest to me that a radically empirical interpretation
of mathematical activity would perhaps provide a better reading of Dewey
than a more conventional one. I quote here the passage of most interest.

In its early history, problems of strictly existential subject-matter provided
the occasion for mathematical conceptions and processes as means of resolving
them. As mathematics developed, the problems were set by mathematical material
as that itself stood at the given time. There is no contradiction between
the conceptual, non-existential nature of mathematical contents and the existential
status of mathematical subject-matter at any given time and place. For the
latter is an historical product and an historical fact. The subject-matter as it is
at a given time is the relatively “given.” Its existing state occasions, when it
is investigated, problems whose solution leads to a reconstruction. Were there
no inconsistencies or gaps in the constituents of the “given” subject-matter,
mathematics would not be a going concern but something finished, ended.
As was intimated in an earlier context, material means and procedural
means operate conjugately with each other. Now there are material means,
having functionally the status of data, in mathematics in spite of their non-
existential character. They constitute the “elements” or “entities” to which
rules of operation apply, while the rules have the function of procedural
means. For example, in the equation 2 + 3 = 5, 2 and 3 are elements operated
upon, while + and = are operations performed. There is no inconsistency in
the identity between the logical function of existential data and mathematical
elements or entities and the strictly non-existential character of the latter.

APRIL 9

Nagel continued his discussion of the pattern of inquiry. He noted that
for Dewey the facts are operational in inquiry in the sense that they are
selected in connection with a particular problem, to test particular solutions
of that problem. Ideas are also operational in inquiry because it is their
function to direct inquiry. For Dewey the facts interact with one another,
but Nagel was skeptical that this was an appropriate way to talk about
facts. As an illustration he mentioned that from the observation of
fingerprints we infer the man was present at the scene of the crime. Is it
appropriate to say that the two facts interact? Nagel said he would prefer to
talk about our organization of the facts.
Referring to pages 118-119, Nagel said that an important distinction for
Dewey between content and object of inquiry had been misunderstood by
some of his critics, who had taken Dewey as holding that inquiry creates its
subject matter. As Nagel conceived it, the distinction Dewey intended was
between subject matter undergoing inquiry and the subject matter of a
completed inquiry. Dewey wants to use the term subject matter for material
undergoing inquiry; he specifically calls this content. The outcome of an
inquiry, which he sometimes calls the outcome of subject matter, is what
he means by an object. Because this language sounds a little unusual, it may
be useful to give the explicit quotations from Dewey on page 119.

When it is necessary to refer to subject-matter in the context of either
observation or ideation, the name content will be used, and, particularly on
account of its representative character, content of propositions.
The name objects will be reserved for subject matter so far as it has been
produced and ordered in settled form by means of inquiry; proleptically,
objects are the objectives of inquiry.
Nagel then turned to the explicit discussion of the nature of judgment
for Dewey in Chapter VII of the Logic. Nagel’s first point was that for
Dewey judgment is not inquiry itself but the settled outcome of it. Thus,
judgment is never found in inquiry except on a partial basis. Secondly,
judgments are to be distinguished from propositions, which do not have direct
existential import. For Dewey a judgment issues in existential consequences.
It is a decisive directive for future activity, but the judgment itself is not
the set of activities that ensue. It has a propositional form, and thus in
general, its truth or falsity can be investigated. Nagel indicated that for him
this analysis presents a difficulty. Knowledge is defined by Dewey as the
terminus of inquiry; but this characterization of judgment indicates that
knowledge is, strictly speaking, not the terminus of inquiry, for following
the assertion of a judgment a further transformation takes place, namely,
the directive for future activity contained in the judgment. (Nagel did not
discuss the distinction between judgments as imperatives or directives on
the one hand and as true or false indicative sentences on the other, but
the reasons for not pursuing this kind of distinction are clear to anyone
who has perused, even cursorily, the Logic.)
Nagel noted that for Dewey a final judgment is always individual in
character. Nagel remarked, in a characteristic vein of his own, “This
is a hard saying,” for many judgments have the form of universal
propositions. Nagel raised the question of how we reconcile Dewey’s assertion that
judgments are individual in character with what appear to be the patent
facts in many inquiries. Nagel remarked that one way of looking at the
matter is that many inquiries by theoretical scientists must be regarded, as
previously remarked, only as partial inquiries. Secondly, he said that it was
necessary here to make Dewey’s distinction between singular and individual.
For Dewey, singulars are named by demonstratives such as this or that.
Propositions about singulars occur during the course of inquiry, but the
final judgment is about the total situation, which is individual. (It seems
clear that Dewey’s views on these matters have been influenced by the way
we think of legal judgments of a court, which, almost always, do deal with
an individual situation and are not expressed in terms of universals. This
particular interpretation was not discussed by Nagel.)
In this discussion of judgment Nagel emphasized the overweening
importance for Dewey of practical activity in contrast to theoretical
investigations. Nagel said he would interpret Dewey as holding that the scientific
inquiries of a mathematical physicist would never be regarded as a
completed inquiry, at least not until they were used in some aspect of practice
or at least in direct, experimental investigation.
Nagel then turned to Dewey’s view that every judgment has the logical
form of subject, predicate, and copula. (That Dewey held this view is
substantiated by many passages in Chapter VII.) Nagel raised the question
whether Dewey was ignoring the central fact of modern logic that many
propositions are relational in form. Dewey, he said, seemed not to be
ignoring the modern logical viewpoint, but to be distinguishing between
propositions which are not true or false, but more or less useful, and
judgments which are true or false. Dewey seems to take the view that
propositions are guides, maps, or blueprints; they are, in other words, means
for making more effective judgments. In certain respects, Nagel said,
Dewey’s view seems to be close to Wittgenstein’s verifiability theory of
meaning. For Dewey only those things are fully warrantable, that is, true
or false, which are individual in character. General propositions do not
have this property but are useful only as guides. However, it should be
emphasized that for Dewey a judgment is not warranted by a simple
perception. Nagel pointed out that if Dewey’s view is taken seriously,
then it is not only the concept of truth for general propositions that is
endangered but the whole concept of confirmation and scientific testing
of theories, in the sense of determining the extent to which evidence
confirms a theory or, in ordinary terms, the extent to which the theory seems
to hold for relevant phenomena. This has the consequence that there is no
direct question of induction for judgments because judgments are always
particulars. They are not related in functional fashion to general
propositions. As a result, induction holds only for intermediate stages of inquiry
and not for the terminus or final judgment of inquiry.

APRIL 14

In this lecture, on which my notes are rather brief, Nagel turned to a
more detailed discussion of Dewey’s analysis of the subject, predicate, and
copula, that is, of the constituents of judgments. As already remarked, for
Dewey a judgment always has a subject-predicate form. Nagel stressed
that Dewey’s emphasis on the form of judgments helps to bring out in what
sense the logical form accrues to a subject matter. Something is a subject
matter not because of its ontological character but because of its logical
character, which amounts to a rejection of Aristotelian substance. Contrary
to many classical epistemological views, for Dewey a fact functions only as
a part of inquiry. To say that something is a subject is to say that a fact
has acquired a logical character it did not have previously. In this connection
Nagel cited a significant passage from Dewey (pp. 128-129).
The condition—and the sole condition that has to be satisfied in order that
there may be substantiality, is that certain qualifications hang together as
dependable signs that certain consequences will follow when certain interactions
take place. This is what is meant when it is said that substantiality is a logical,
not a primary ontological determination.

Certainly these views on the problem of substance are consistent with
Dewey’s general position.
Nagel then emphasized Dewey’s view that predicates provide a method
of solution and do not themselves constitute a solution. I find my notes on
this discussion very thin. I suspect that I probably did not understand very
well what Nagel was saying. In a modern vein, it seems appropriate to me
to say that when Dewey talks about predicates as methods of solution, he
is emphasizing the use of language for purposes of communication in
inquiry and here, as always, playing down the use of language for the bare
expression of fact.
Nagel then turned to Dewey’s view that the copula expresses a functional
correspondence of the subject and predicate. In particular, the copula
expresses, as Dewey puts it,

the act or operation of “subjection”; that is, of constituting the subject. It is a
name for the complex of operations by means of which (a) certain existences
are restrictively selected to delimit a problem and provide evidential testing
material, and by which (b) certain conceptual meanings, ideas, hypotheses, are
used as characterizing predicates. It is a name for the functional correspondence
between subject and predicate in their relation to each other. The operations
which it expresses distinguish and relate at the same time [pp. 132-133].

After completing this discussion of the copula as such, Nagel emphasized
Dewey’s view that a judgment is a temporal affair; or, as Dewey puts it,
“that judgment is a process of temporal existential reconstitution.” Most
philosophical readers will be troubled by this obscure phrase, but the idea
that judgments are temporal occurrences does in itself tie in with recent
discussions of utterances and statements. In any case, Nagel summarized
Dewey’s position as amounting to the view that a judgment consists in overt
transformation of the situation or in the actual doing of something, rather
than simply in the assertion of a proposition.

APRIL 16

In this lecture, Nagel turned to an analysis of Chapter IX, “Judgments
of Practice: Evaluation.” Nagel began by pointing out that declarative
propositions are only intermediate and instrumental. They are statements
of what conditions do exist. In contrast, judgments of practice contain
overt transformations, of the sort to which the classical theory of judgment
often denies status. The classical theory asserts that even in the case of a
practical judgment there is in the judgment itself no overt transformation.
Nagel mentioned that Dewey attempts to separate the linguistic situation
from the psychological one. The mere existence of a practical syllogism
does not in itself indicate the presence of a judgment of practice. If this
syllogism is a matter of habit on the part of the user, then no issue of
judgment arises. It is only when there is a question of doubt and inquiry
that a genuine judgment of practice is made.
Nagel pointed out that if we accept this account, then many judgments
ordinarily taken as final must be classed as instrumental. A judgment of
practice for Dewey always involves a transformation of antecedent
conditions.
In this connection, Dewey defends the position (pp. 167-168) that
moral evaluations have the character he has ascribed to judgments of
practice and are not, as often thought traditionally, predetermined and
given ends in themselves. What Dewey has to say about ethics is often
better and clearer than his comments on other topics. The following passage
puts his view very well.

The notion that a moral judgment merely apprehends and enunciates some
predetermined end-in-itself is, in fact, but a way of denying the need for and
existence of genuine moral judgments. For according to this notion there is no
situation which is problematic. There is only a person who is in a state of
subjective moral uncertainty or ignorance. His business, in that case, is not to
judge the objective situation in order to determine what course of action is
required in order that it may be transformed into one that is morally satisfactory
and right, but simply to come into intellectual possession of a predetermined
end-in-itself [p. 168].

At the end of this lecture Nagel remarked that a source of much
misunderstanding of Dewey in his discussion of practical judgments, and in
particular, moral judgments, is Dewey’s distinction between evaluating
and valuing. The distinction for Dewey is parallel to that between having
and knowing. It is the distinction between having a value experience and
evaluating that experience.

APRIL 23

In this lecture Nagel discussed Chapter X, “Affirmation and Negation:
Judgment as Requalification.” In this chapter Dewey considers such matters
as his view of the traditional A, E, I, and O propositions. Nagel began by
emphasizing, once again, that declarative propositions are not simply
enunciatory for Dewey, but are means leading to solutions. It is in this
light that Dewey’s discussion of affirmative and negative judgments must
be viewed. Nagel recalled that the concept of affirmative and negative
judgments was introduced by Aristotle in the Prior Analytics, and is
ordinarily regarded as a grammatical distinction; but for Dewey the basis
of the distinction is not grammatical, but is rather a difference in function.
Affirmative propositions represent agreement of subject matters in their
evidential capacities. Negative propositions represent subject matters to be
eliminated because of their irrelevancy or indifference to the evidential
function of material in the solution of a given problem. In this connection Nagel
remarked that it is rather difficult to find statements that are negative in
Dewey’s terms, but affirmative in grammatical form. In any case, Nagel
emphasized that for Dewey both forms of proposition have an intrinsic
connection with change. Dewey does not mean that every such proposition
reports a change, but that affirmative or negative propositions are logically
grounded in the exclusion of alternatives.
Nagel then turned to Dewey’s interpretation of the traditional square of
opposition. As might be expected, Dewey emphasizes that the traditional
relations of contrariety, subcontrariety, and contradiction have to be
understood in functional and not in formal or mechanical terms. What is
inadmissible for Dewey is the interpretation of propositions as independent
sets of objects to be considered by and of themselves and in their relation to
each other. For Dewey the traditional opposition of contraries is to be
interpreted in terms of setting limits within which specific determinations must
fall. His concrete example is that what we know about marine vertebrates
must fall between the A proposition “All marine vertebrates are
cold-blooded,” and the E proposition “No marine vertebrates are cold-blooded.”
For Dewey these contrary propositions cannot represent conclusions or the
terminus of an inquiry, but are the results of a preliminary survey. In
interpreting this view of Dewey’s, it is important always to remember that
for him the terminus of inquiry must be an individual judgment, not a
general proposition. In terms of this general view it is quite clear and
natural to maintain that A and E propositions cannot themselves be the
terminus of inquiry. In this connection, Nagel made the point that in
experimental inquiries it is usually not sufficient to operate with the bare
formality of A and E propositions as setting natural limits of inquiry.
Usually much more detailed and more specific information is available. For
example, in investigating the temperatures of marine vertebrates we would
not ordinarily look at the contraries “All marine vertebrates have a
temperature of 60°F.” and “No marine vertebrates have a temperature of
60°F.” We would rather investigate a range of temperatures and would
have an entirely different formulation of the relevant propositions to be
tested. At this point Nagel said that he really did not know what Dewey
would have said to this objection.
For Dewey, subcontraries are more determinate than contraries but are
still indeterminate compared with the individuality of final judgment. In
Dewey’s view, subcontraries are used only if they are determinate, that is,
if they have some ground for support. At this point Nagel considered
Russell’s famous analysis of the sentence “The present King of France is
bald,” and he said that in his judgment it was better to make explicit what
you mean by proposition before you consider contraries and subcontraries
in relation to factual data or material. Dewey seems to hold that the
meaning of propositions is determined by factual data, and this Nagel felt should
be avoided.
Nagel next turned to Dewey’s discussion of contradictories. He pointed
out that for Dewey a universal or general proposition is negated not by
the indeterminate “some” but by the determinate singular. Nagel admitted
that Dewey is correct in terms of much actual practice and inquiry. General
propositions are denied by the use of a singular proposition and not merely
by an existential statement. Nagel made the point, however, that there is a
standard, broader usage. It is possible to establish a particular or existential
proposition without using singular propositions; insofar as this must be
accepted as established doctrine or procedure, Dewey’s position must be
qualified. My notes do not show that Nagel discussed this point in detail.
Finitistic or constructive positions about the foundations of mathematics
would seem to offer specific support of Dewey’s position.

APRIL 28

In this lecture Nagel turned to Chapter XI, “The Function of
Propositions of Quantity in Judgment.” Nagel began by noting the traditional
distinction between universal and particular propositions, and emphasized
Dewey’s criticism that this classical Aristotelian distinction is too
restrictive for modern science. Dewey’s point is that in the context of modern
quantitative science, the qualitative distinction between all and some is so
crude and general as almost to be irrelevant. Nagel pointed out that Dewey
did not have a very thorough understanding of the role of quantifiers in
modern logic and of their relation to the construction of the real number
system on the foundations of set theory. He gave as an example the way
in which quantifiers may be used to express the sentence that there is exactly
one mayor of New York City. What Nagel said is certainly correct and
sound. It can be said on Dewey’s behalf that his remarks were aimed
really at the kind of applications of Aristotelian logic to scientific matters to
be found in Aristotle and not to the formal doctrine itself. It is certainly
true that much of the discussion in Aristotle and in particular those
applications that hew as closely as possible to the line of his logic are not
scientifically very deep or satisfactory.
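One standard way of rendering Nagel's example in first-order notation is sketched below. The predicate letter M, read "is mayor of New York City," is an assumption introduced here for illustration; the notes do not record the formula Nagel actually wrote.

```latex
\exists x \,\bigl( M(x) \land \forall y \,( M(y) \rightarrow y = x ) \bigr)
```

Read: something is mayor, and anything that is mayor is identical with it. The numerical claim "exactly one" is thus expressed by quantifiers and identity alone, without the use of any numeral.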
Nagel noted that for Dewey distinctions between kinds of quantity are
for functional purposes, and measurements are instrumental always in
relation to certain aims. Throughout this chapter Dewey emphasizes this
instrumental character of measurement in contrast to what he likes to call
the mistaken cosmological and ontological framework of Aristotle.
Nagel then turned to the discussion of some of Dewey’s constructive
views about measurement. One of the first things to note in reading the
chapter is that Dewey generalizes the common conception of measure to
include any comparison. In other words, for Dewey it is not necessary to
think of measurement as assigning numbers to objects. An example would
be the specification that liquid A is denser than B when B will float on A.
It is also a characteristic thesis of Dewey’s that there is no fundamental
antagonism between quantitative and qualitative distinctions. For Dewey
there is an underlying qualitative continuum at the basis of all quantitative
measurement. For example, in the measurement of length we assume the
quality of spatial extension. What we do in measurement is to ignore
certain qualities and to take others into account.
Against the critics of quantification, Dewey says that we have enhanced
the control of the emergence of certain qualities by the introduction of
quantitative techniques. The objection to measurement raised by many
philosophers is not a sound one. Here, Dewey had in mind the objection
that the reduction of a subject matter to numerical measurements made it
dead and bare. As Dewey emphasizes throughout the chapter, the
development of modern science renders that thesis rather ridiculous.
At the end of this lecture Nagel digressed to reject the Kantian view
that measurement always involves space. I still have a vivid memory
of this discussion. What Nagel had to say about this Kantian view seemed
to me to exemplify just the kind of thing a philosopher should be saying
and doing. Nagel considered a number of examples from the physical
sciences of measurement and asked, in each case, whether or not it was
possible to dispense with spatial extension in measurement. After
examination of a number of cases, he argued that it was not always required that
spatial concepts be involved in measurement and he cited as the primary
instance the measurement of time. Unfortunately, my notes do not show
the detailed remarks Nagel made about the measurement of time, but
in any case it would be a digression to enlarge upon this point here.

APRIL 30

This lecture was mainly devoted to Chapter XIII, “The Continuum of
Judgment: General Propositions.” Nagel first discussed Dewey’s analysis of
the types of existential propositions. There are first particular propositions.
These are propositions which qualify a singular. An example would be the
proposition “This is sweet.” Singular propositions, on the other hand,
determine a singular as one of a kind, for example, the proposition “This is a
dog.” Nagel pointed out that for Dewey the notion of kind is important
and is to a large extent borrowed from earlier logicians, particularly Mill.
For Dewey the presence of a kind means the co-occurrence of traits, so
that one can serve as the sign for the others. As has already been noted
in earlier discussion, in Dewey’s hierarchy of propositions, singular
propositions have a more complicated function and a more central position in
inquiry than do particular propositions. Their relation to judgments and
determination of inquiry has already been noted.
Nagel next turned to Dewey’s concept of generic propositions, which are
a species of general proposition. They are primarily propositions about
relationships of kinds. What is recurrent in generic propositions is the
power of certain qualities to serve as signs. For Dewey it is never the
immediate qualities of things which are general, but the mode of using signs.
Nagel pointed out that this is a point of consistency in Dewey. It
characterizes his effort to show distinctions as logical rather than ontological,
and to show that logic arises and operates within inquiry. Thus, generality
arises from a certain habit that has been established so that an individual
acts in a certain way when confronted with signs of a particular kind. Nagel
contrasted this view with the traditional assertion that universals arise from
certain kinds of signs that stimulate certain responses. Nagel pointed out
that instead of saying that a particular color red is a universal because the
redness recurs, Dewey says we can claim individuals react a given way to
this stimulus because of certain habits. Nagel pointed out, however, that if
we say that the recurrence of the habit is due to a common element we
have pushed the analysis back only a very small step. For Dewey this
would be a matter of generating a new inquiry. Nagel said Dewey could
always reply that we need not commit ourselves to a Platonic idealism,
but can always say that signs merely function for other things and that
we need no Platonic universal for taking care of the facts. Thus, for Dewey,
universality consists not inherently in the thing itself but in the mode of
response. Pursuing this line of thought, Nagel noted that the inquiry into
the response itself requires a response, and thus we have an infinite
hierarchy of responses. On the other hand, it is not a vicious regress since
for knowledge we do not need to know all levels. As an example of this,
Nagel mentioned an animal exposed to an auditory stimulus. The animal
is able to respond in appropriate fashion and yet is not able to conduct
inquiry at a deeper level.

MAY 5

This lecture was devoted to Chapter XIV, “Generic and Universal
Propositions.” I find my notes on this lecture to be rather unsatisfactory, and
in an effort to bring them into better order, I have read Chapter XIV once
again. I must confess that I find Dewey unusually obtuse in this chapter.
The whole discussion is at a level of such vague generality that it is difficult
to pin down and evaluate his central theses. It seems fair to say that in this
lecture Nagel was dealing with very recalcitrant material.
Nagel began by pointing out that the distinguishing feature of generic
propositions for Dewey is that, unlike universal propositions, they deal
directly with that which is existential. Universal propositions, on the
other hand, connect attributes in a necessary fashion, necessary at least
in the context of inquiry. In a characteristic turn of phrase, Dewey says
that universal propositions are modes of action or possible ways of acting.
The possibility is expressed by the traditional if-then form. A universal
proposition tells us the conditions that inquiry aims to introduce. In this
respect, universal propositions are in some sense definitional in character.
They are definitional in that they constitute an analysis of a concept into
its constituent elements. As an example, Nagel gave Newton’s law of
gravity. It is a universal proposition; and insofar as it expresses a universal
and necessary property of material bodies, it is also a partial definition
of material bodies. Here, I think, Nagel was saying something that was
clearer as a doctrine in his lectures than is the corresponding set of ideas
in Dewey.
Nagel mentioned that for Dewey there are two types of universal
propositions. He referred to Chapter XX, on mathematical discourse, and in
particular to page 397. One type of universal proposition is exemplified
by physical laws and the other by propositions of mathematics. Nagel
then asked what these two types of universal propositions have in common.
Dewey’s answer seems to be that both are in some sense definitional in
character. On the other hand, Nagel pointed out that Dewey distinguishes
different kinds of physical laws, some being generic and some universal.
An example of a generic physical law would be the proposition that all
whales are mammals. This proposition is generic because it asserts a
relation of kinds, more explicitly, an existential connection between kinds.
Nagel pointed out that for Dewey the logical status of a generic
proposition is that of an I or O proposition, an essentially contingent type of
proposition. A universal proposition, on the other hand, is a necessary
proposition which is not capable of being refuted by experience, but may
be abandoned in the light of inquiry.
There is no doubt that Dewey intends to deal with some traditional
logical distinctions in Chapter XIV, but it is difficult to be very sympathetic
with his enterprise, or to believe that he is making distinctions of much
contemporary use in this chapter. Even if the distinctions are significant,
the maddeningly vague and muddled way in which he discusses them
makes it hard to take him seriously.

MAY 7

Nagel continued the discussion of generic and universal propositions.
He said that he would like to give his own explanation of how universal
propositions are intended to function in Dewey’s system, by referring
to Peirce’s well-known distinction between premises and leading principles.
He made the point that leading principles provide a means of making a
transition from premises to conclusion, but also avoid the infinite regress
illustrated by Lewis Carroll’s parable of the tortoise and Achilles. He spent
some time in discussing the tale of the tortoise and Achilles in order to
bring out the necessity of having some rules of inference. He outlined
the situation here with great clarity, but I shall not summarize his
presentation, because of its general familiarity as an example illustrating the need
for rules of inference.
Nagel then pointed out that sometimes premises can be converted
into leading principles. He gave as an example the following syllogism:
All A is B, all C is A, therefore all C is B. We could change this by
converting the first premise into a leading principle and then having
only the single premise, all C is A.
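The conversion can be sketched schematically as follows. The inference-bar notation and the wording of the rule are assumptions made here for illustration; they are not drawn from Peirce or from the lecture notes.

```latex
% Original syllogism: two premises above the bar
\frac{\text{All } A \text{ is } B \qquad \text{All } C \text{ is } A}
     {\text{All } C \text{ is } B}
\qquad\Longrightarrow\qquad
% After conversion: one premise; the former first premise now licenses the step
\frac{\text{All } C \text{ is } A}
     {\text{All } C \text{ is } B}
\;\; (\text{leading principle: whatever is } A \text{ is } B)
```

On the right, "All A is B" no longer appears among the premises; it does its work as the rule that permits the passage from the single premise to the conclusion.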
Nagel then remarked that this discussion of Peirce’s conception of
leading principles is germane to Dewey’s distinction between generic
and universal propositions, because for Dewey universal propositions
function primarily as leading principles in inquiry. Universal propositions
formulate the kind of operations that are used to determine the sets of
traits common to a kind. Nagel then considered some examples, contrasting
mathematical and physical propositions in the discussion.
Nagel first asserted that it is in one sense patently false to say that
Newton’s second law is as necessary as 2 + 2 = 4. For the denial of
Newton’s second law will not lead to a contradiction, whereas the denial of
2 + 2 = 4 will. On the other hand, he would agree with Dewey that universals
are not used for descriptions of matters of fact. Nagel then showed how
Newton’s laws could be used as leading principles. He referred to Carnap’s
Logical Syntax of Language and said he would follow Carnap’s
distinction between premises, rules of inference, and conclusions. He pointed
out that rules of inference are usually regarded as logical rules, but
there was no reason that there could not also be rules of physical inference.
Newton’s second law could be taken as such a rule of physical inference.
He gave as an example the standard formula for computing how far a
body has fallen in t seconds. The formula, s = ½gt², acts as a principle
of physical inference. We are given the premise that the body has fallen
for 4 seconds. Our conclusion is to state how far it has fallen. It is easy
enough to convert the formula, s = ½gt², into a rule: Given the numerical
value of the time, we calculate the distance by multiplying the time by
itself, multiplying that result by the constant g, and dividing by 2. The
resulting number is the number of feet that the body has fallen.
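The rule just stated can be written out as a short procedure. What follows is only a minimal sketch in Python: the function name `distance_fallen` and the value g = 32 feet per second per second are assumptions added for illustration, since the text gives no value for g and says only that the result is in feet.

```python
def distance_fallen(t_seconds, g=32.0):
    """Rule of physical inference: from the time of fall, infer the distance.

    g defaults to 32 ft/s^2, an approximate value assumed for illustration.
    """
    t_squared = t_seconds * t_seconds   # multiply the time by itself
    product = t_squared * g             # multiply that result by the constant g
    return product / 2                  # divide by 2; the result is in feet


# Premise: the body has fallen for 4 seconds.
# Conclusion: the distance it has fallen, in feet.
print(distance_fallen(4))  # 256.0
```

The premise supplies only the number 4; the law itself does not appear as a further premise but is built into the procedure that carries us from premise to conclusion.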
Nagel pointed out that such physical rules of inference not only
determine physical consequences, as in the example just discussed, but also in
part determine the meanings of terms employed in the premises and in
conclusions. According to Nagel, Dewey is suggesting that universal
propositions of physics, for example, are necessary in the sense that the meanings
of terms that we use in investigations are partly fixed by these propositions.
If the universal propositions are abandoned, then the meanings of terms
necessarily change.
Nagel said that he did see a difficulty in the sense that what Dewey
calls universal propositions are in fact often used as premises and not
as leading principles. Nagel did say that it is possible to look at the matter
functionally and to claim that a given proposition can function in some
contexts as a premise and in others as a rule of inference. Nagel said that
he would put the matter this way: it would be possible to introduce a
greater degree of “relativization,” which would permit us to use a proposition
either as a premise or as a leading principle of inference.
Nagel concluded the lecture by discussing some of the advantages of
regarding universal propositions as leading principles. In the first place
we test them only insofar as they are useful. For the pragmatist and
especially for Dewey, theories of science are simply the means of getting
from one set of singular propositions to another. Nagel admitted that
initially this viewpoint does seem to clear the air. We can then assert that
physical laws do not necessarily reflect the structure of things in the
universe but simply provide tools for getting from one singular proposition
to another. Nagel did say that he felt there were difficulties in this
view as well, but there was not time to pursue them on this occasion.

MAY 12

In this lecture Nagel dealt with that part of Chapter XVII concerned
with what are often called the laws of thought or formal canons of logic.
Traditionally, this discussion has centered around the principles of identity,
contradiction, and excluded middle, and Dewey discusses each of these
principles at the end of the chapter.
Nagel began by saying that the function of laws of thought or formal
canons is to state the ultimate conditions which propositions must satisfy
to function properly in inquiry. He turned then to the discussion of the
first principle, that of identity. He remarked that this principle is not
to be found in Aristotle, and that probably the first explicit formulation
is found in Leibniz. Traditionally the principle expresses an ultimate
condition on any subject matter, but for Dewey it expresses something
different. He here cited Dewey’s own formulation (p. 344) that the principle
is “the logical requirement that meanings be stable in the inquiry-continuum.”
Nagel remarked that this interpretation of Dewey’s is
obviously different from that given by realistic logicians (realistic is
here, of course, an ontological term).
Nagel next turned to the principle of contradiction. He stated that for
Dewey this principle sets the ground of complete exclusion. It is “a condition
to be satisfied.” Put another way, Nagel said that for Dewey the
principle states a condition that propositions must satisfy to be used in
inquiry; thus it says nothing ontologically.
Next, Nagel turned to the principle of excluded middle. He formulated
the principle, both in the form that everything has either the property A
or not A, and also in the form p or not p. He quoted Dewey’s own
remark on page 346: The principle “presents the completely generalized
formulation of conjunctive-disjunctive functions in their conjugate relation.”
Nagel also emphasized that for Dewey the principle is a logical condition
to be satisfied. It is a directive for making definitions.
Discussing the three principles together, Nagel then emphasized that
for Dewey the three principles have no ontological status, but, quoting
Dewey, page 346, “as formulations of formal conditions (conjunctive-disjunctive)
to be satisfied, they are valid as directive principles, as regulative
limiting ideals of inquiry.” In this connection Dewey discusses the
classical objection to the law of excluded middle, that the law does not
apply to changing relations. Nagel affirmed that he felt Dewey’s answer was
wholly sound. The principles are meaningless for changing relations unless
they are considered as conditions to be satisfied.
Nagel concluded by pointing out that it is essential for Dewey’s logic
that these laws are not descriptive of traits that exist outside of inquiry,
but play a logical role within the context of inquiry. He mentioned that
a predecessor of Dewey in this line of thought was F. C. S. Schiller.
It should be mentioned that Nagel spent some time explaining the
standard logical formulations of each of the three principles, and I have
omitted that material here.

MAY 14

In this lecture, the last one of the term, Nagel began by discussing
Dewey’s views on induction. He stated that Dewey’s views on induction
and more generally on scientific method were fairly orthodox, but as he
would show later in the lecture his views on causality were not. Nagel
remarked that Dewey is concerned to show that a given sample or class
has representative connections for the whole. This is one formulation of
a classical problem of induction. The difficulty, of course, is to know
how to state the criterion of representativeness. Nagel pointed out that
beyond stating the problem correctly, Dewey says little more. For instance,
at no point does he discuss, even at an elementary level, general principles
of statistical inference. Nagel remarked that it was disappointing to find
so little specific discussion in Dewey, considering the definiteness of much
of the literature on induction.
Nagel then turned to a discussion of Chapter XXII, “Scientific Laws—
Causation and Sequences.” He first pointed out that for Dewey causation is
taken as having a logical, rather than an ontological, character. Nagel
said that he thought that Dewey’s analysis of causation was one of his
most successful efforts. His attempt to give causation a logical status,
to place it in the context of inquiry, and to deny it a status in nature,
as such, represented an attempt to say something important and new.
Dewey continually tries to make it clear that we must be wary of asserting
anything about order in nature. Causal laws for Dewey are a means of
introducing links between events, but he emphasizes that in his view the
links do not exist prior to inquiry. In this respect he is very much against
Mill’s view of causal laws as necessary and unconditional. According to
Dewey, what Mill should have said is that causal laws are means by
which a certain kind of uniformity between events can be established.
Dewey continually reiterates, in contrast to Mill, that the subject of science
is not sequences of events but the establishing of links between traits or
characters of events.
In summing up Dewey’s analysis of causality, Nagel expressed the
view that Dewey’s analysis of causation is one of the most original parts
of the instrumentalist-pragmatist position. The particularly distinctive
feature is the view that all general propositions have a significance only
within inquiry. To ask the question that has repeatedly been asked in the
history of philosophy about the representativeness, or even the ground
of general propositions, is to raise questions that pull the propositions out
of context. What a general proposition is can only be determined by what
it is used for. Interpreted this way, we do not take general propositions as
representative of the structure of nature as such. It is fair to claim that the
adoption of this view permits a wholesale “deontologizing” of a wide range
of propositions. This view, which Dewey defends so consistently, provides
quite a fresh viewpoint in the history of philosophy. Regarding the further
question of whether Dewey can disprove the ontological interpretation
against which he argues so vigorously, Nagel said it seems fair to say that
Dewey can show only that such an ontological interpretation is not
required. The assigning of an ontological status to propositions is in no
way necessary for understanding them. However, the final question to be
asked of Dewey is whether he can avoid by this move all questions of
metaphysical status. Nagel ended the lecture by stating that this question
is a matter of considerable debate.
The reader should remember that the synopsis of Nagel’s lectures given
here is a very much abbreviated version, and, perhaps just as important,
is a version based on the notes of a new and rather naive student of
philosophy. If the reader feels that some points are put too simply or
inaccurately, the fault is almost surely mine rather than his. In spite of
such faults, the recording of these notes may still be of some service, for
Dewey has possibly the most impenetrable prose style of any serious
philosopher since Hegel. On the other hand, like Hegel, he has important
and fundamental things to say.
It seems fitting to end with the closing lines of Dewey’s Logic, which
describe so well not only the major thrust of his work but the dominating
spirit of Ernest Nagel’s philosophy as well.

Since scientific methods simply exhibit free intelligence operating in the best
manner available at a given time, the cultural waste, confusion, and distortion
that results from the failure to use these methods, in all fields in connection
with all problems, is incalculable. These considerations reinforce the claim of
logical theory, as the theory of inquiry, to assume and to hold a position of
primary human importance.
SOME DIFFICULTIES IN KNOWING
Stuart Hampshire

There are at least three distinct kinds of challenge to, or rebuttals of,
a claim to knowledge: The first is the simple rebuttal—“What you claim
to know to be true is not true”; the second is a challenge which questions
the source of the knowledge or the method by which the alleged knowledge
has been obtained. This challenge is commonly expressed in the words
“How do you know?” When such a question is put as a challenge, it is implied
that the claim to genuine knowledge is not acceptable unless a reliable
source, or a reliable method, has been used in the particular case. A claim
to knowledge is not to be respected unless the knowledge claimed has a
respectable origin; the speaker may be required to show that he is an
authority on the particular issue, as he has implicitly claimed to be. He
must be able to show that he is in a position to know that which he claims
to know. Otherwise he is exposed to the rebuttal: “What you say may be
true—but you cannot now possibly know that it is.” A third and different
challenge, or different kind of challenge, may be phrased in various ways,
but is commonly expressed in the words “Are you sure?” Human beings
are apt to err: a liability to mistake attends all their performances, not
excluding their search for knowledge. They make mere mistakes, as it
seems, by chance, or perhaps even inexplicably. They use words carelessly,
inaccurately. They forget things. They overlook things. They write
down the wrong number for no reason at all, or for no reason that they can
give. They often have a reliable source and a reliable method; they have been
asked the time; they have a watch; but then they misread the figure on
the dial carelessly. A machine may misread the figure on the dial, but not
carelessly; for it does not employ, or need to employ, care in reading it
correctly. But human beings do. They are not instruments: or, if they
are instruments, they are instruments that use themselves, and they
may misuse themselves. They need to be careful in order finally to be
sure, because they may go wrong without anyone having a reliable
method of finding out why they have gone wrong. “Are you sure?” unlike
“How do you know?” asks, among other things, whether you have on
this occasion been careful, or careful enough, in using a generally reliable
method. Of course you can also make a machine check its results; but
this is not the same as asking it whether it is sure that it has not made
a mistake. A claim to knowledge is certainly not to be respected unless
the knowledge claimed has a respectable origin. But more than this can
properly be asked: “Are you sure that you have not just made a mistake?”
I am just mentioning well-known facts here. These well-known facts are
of the first importance in understanding the notion of knowledge, and perhaps
also the concept of following a rule. I check the proof to be sure that
I have not made a mere mistake, just a slip, somewhere in the derivation.
I look at my watch again when the small boy who has asked me the time
says, “Are you sure?” This kind of fallibility, or corrigibility, is always in
the background, and with some of us, on some topics, it has to be in the
foreground when we claim to know. I may be in the best possible position
to know that something is the case, and yet I may throw away my advantage,
inadvertently, carelessly, incompetently, for no reason at all. This
is why I can be asked “Are you sure that you are right?” when I have
claimed to know something on occasions when it would be absurd to ask
me how I know. I may be reminded of one kind of fallibility even when
the other kind of fallibility—fallibility in respect of source or method—is
not in question.
There notoriously are occasions when the question “How do you know?”
would be at least absurd, and perhaps unintelligible, as a question. These
are the same occasions on which a statement shows, in virtue of its grammatical
form and its topic, these two taken together, that the speaker is in
the best possible position to claim to know that it is true. For example, he
who reports that he is currently experiencing a certain sensation cannot
intelligibly be asked how he knows that he is; it is already shown, in the
grammar and vocabulary of the statement itself, that he is in the best
possible position to claim to know that his statement is true. The grammar
and vocabulary show that he is the authority on this matter, that he is in
the optimum position for making that statement. But he can intelligibly be
asked whether he is sure that his report is correct; doctors often do put
just this question, and it is sometimes, and in abnormal conditions, difficult
to answer it; on the other hand, it is often, and in normal conditions, very
easy indeed to answer it—e.g., when the description is not a very specific
one or when the sensation is a familiar one. Of the first person, present
tense reports of sensations, one may say that it is evident that the speaker
is the authority. But he is also a fallible authority, particularly if he attempts
a very specific description, or a description that is in some other way ambitious
and highly informative, or when he describes a sensation that is
very unfamiliar.
We may therefore, as a preliminary, divide statements about states of
mind, attitudes, and desires into two classes: first, those statements which
admit the challenge “How do you know?” when a claim to knowledge has
been made, as well as the less specific challenge “Are you sure?” In the
second class, we have those statements which show, in their grammar and
vocabulary, that the speaker is in the best possible position for claiming to
know that the statement is true, that he is the authority, and that no question
about the source of his knowledge arises; then only the less specific
challenge “Are you sure?” is in place, a challenge that requires the speaker
to think again about the statement in case he has been careless in making
it and has not on this occasion taken due precautions against mistake. This
latter challenge is appropriate to any claim to knowledge of any kind,
whatever the grammar and the vocabulary of the statement may be. But it
will be especially appropriate when the topic is complex, or when for some
reason it requires a careful use of words or careful matching of a specific
description, or when there is evidence that the speaker has been hasty or
careless or that he has been unreliable in the past. We all know, for example,
how difficult it may be to describe sensations when they constitute
symptoms to a doctor; there is the difficulty of not exaggerating or understating,
of distinguishing between that which is more correctly called pain
and that which is more correctly called discomfort. And if we are asked
“What kind of pain?” or “What is the pain like?” we know how difficult it
may be to give a more specific description and to avoid error in matching
the right description to the phenomenon.
With this kind of possibility of mistake, I may hesitate, be doubtful,
and need to stop to think before answering a question, even though I am
clearly in the best position for claiming to know the correct answer; the
grammar and vocabulary of the question alone may show that, if I do
claim to know the answer, no question would arise of how I know or what
the source of my knowledge is. Yet I hesitate, am doubtful, and do not know
what the correct answer is. Someone else may think that he knows what
the correct answer is, and his belief may prove correct, even though I, the
authority, do not yet know. The doctor may know: he has seen lots of
cases like mine; perhaps he has had the disease himself, or he has heard
lots of reports, and he is thoroughly familiar with the appropriate vocabulary.
He has used a reliable method of inference; and I may subsequently
confirm that his conclusion was correct. He had arrived at a conclusion
that turns out to be correct by an argument that establishes that so-and-so
must be true or is likely to be true. If after careful attention to the phenomenon
and to the description, I confirm his conclusion, I do not present
the conclusion of an argument; I report what I claim to know, or what I
believe, without benefit of argument. In the old-fashioned, much abused
phrase, I now either know, or I think I know, directly.
Because of the possibility of carelessness of different kinds and the
possibility of inaccuracy in the use of words or symbols, it is never an
offense against the proper use of language for someone to argue that he
who is in the best possible position for making a certain statement is probably,
in a particular case, mistaken. It may, in the circumstances of a
particular case, be irrational and silly so to argue; but it will not be an
offense against the proper use of language, as it would be to ask “How
do you know that you are in great pain?” as opposed to “Are you sure that
‘great pain’ is not an overstatement and untrue?” Because of this kind of
corrigibility, an inductive argument may be relevant to establishing the
truth or falsity of a statement which the speaker himself was in a position
to claim to know to be true without any appeal to evidence or argument.
From the fact that we may sometimes have good inductive reasons to believe
that a man is probably mistaken in the report that he gives, for
example, of his sensations, we must not conclude that such statements
require inductive evidence as their support when a claim to knowledge is
made. They do not require inductive support when the speaker is also the
designated subject of the statement; but an inductive inference may on
occasion lead us to the conclusion that such a statement, made on the best
authority, by a man trying to be truthful, is probably false, or even, in some
cases, that it must be false, where the “must” is the sign of an inference.

II

In the case of sensations, then, the subject’s difficulty, if he has one, will
typically be one of matching the right description to the phenomenon
experienced. If he hesitates and is unsure about the right reply to a question
about a present sensation, he may be uncertain about what to call the
sensation or how to classify it. Since truth has always been represented
as an agreement between a statement and that to which it refers, the
subject has a doubt which is a pure case of doubt about the truth of the
various possible descriptions that suggest themselves to him. This is one
particular kind of doubt among others, and a particular kind of difficulty
in achieving a true statement. He needs to be accurate, or exact, to find
words that fit perfectly and that are not just approximately right; and it may
be difficult to get an exact fit in words. On the one side is the phenomenon,
the reality to be described, which is in no way concealed from him, and
is, so to speak, transparent when he attends to it; he therefore does not need
to investigate further, or to probe or to experiment or to approach the
phenomenon from another angle. On the other side, requiring to be
matched, is the commonplace vocabulary in use from which he has to make
a fitting selection. If he is not sure whether a particular description that
suggests itself would be a truthful description, this would not normally be
accounted a case of ignorance; it is not exactly that kind of lack of knowledge
from which he suffers, unless it is some ignorance of the standard
use of the relevant words. It is an uncertainty of another kind. Let us call it
the semantic uncertainty. It is in some respects very like the uncertainty
that a man may feel when trying to discern whether one pattern exactly
matches another pattern in a different medium. It is uncertainty about
matching possible descriptions with an independent reality. This kind of
uncertainty, which might occur in connection with almost any kind of
empirical statement, is often at its most acute when a man is trying to
find the most fitting and accurate description of his own sensations. Every
kind of statement has its own kind of liability to error, and this matching
of descriptions to the phenomena becomes more crucial when other liabilities
to error have dropped away. Then one may resolve his uncertainty by an
inductive inference. This description is probably the right one. Then he
may tell you, by reconstructing the inference, why he believes that this
must be the correct description; but he will not tell you how he knows;
for his belief evidently falls short of the ideal case and therefore does not
constitute knowledge. He was indeed in a position to know, but he was
unsure; therefore he had to use an eccentric, and not the standard, method
of arriving at his statement. In a parallel case, a man was in a position
to know what was said at a meeting because he was present and heard the
speeches, but he happens not to be sure. He may fill the gap by inferring
what must have been said. If he is asked “Do you know that this is a true
account?” he will have to indicate that he is not claiming the authority
of a witness. He can tell you why (using what evidence) he believes that
this is a true account.

III

Consider next another type of case in which a man is asked a question
about himself, in respect of which he is in the best position to know the
answer; and yet he does not know and admits that he is uncertain about it.
He is asked whether he wants, or would like, to go to Italy with a certain
group of people or not; let us suppose that it is certain that his emotions
are strongly engaged by the prospect, and that he is far from being indifferent
about it. He is not being asked whether he will go; that is, he is not
being asked to decide an immediate practical question. Let us suppose that
the inquirer is intensely curious about the subject’s state of mind, as it
concerns this question, and that there is no practical possibility immediately
in view at the time of the inquiry. The questioner wants the subject to reflect
and to tell him what his desires are; the questioner might even ask
how he feels about going to Italy. The request is a request for information.
The subject reflects and replies: “I do not know whether I want to go
or not. I can’t tell you yet; you must allow me to think about it further.”
We are very often uncertain about what we want to do in a specific
matter which is very far from indifferent to us. “I don’t know what I want
to do about it: I am not sure whether I want to or not” is a form of words
often used when, for example, a man has a divided mind about the
project; perhaps there are features of it that are attractive to him and
features that are repugnant. Or perhaps he feels that he has not thought
about it enough, or perhaps he is confused about it. Or again there is a
wide range of different types of cases where “I am not sure what I want”
may express an uncertainty in specifying precisely the nature of the object
wanted rather than an uncertainty about the balance of desirability in an
already fully specified object. Or the uncertainty may be an unsureness in
focusing my desire on its proper object and in eliminating alternatives. Or
the subject may be vacillating and in a state of turmoil about the matter.
Anyone who (in delineating the notion of knowledge) respects the actual
uses of language will admit that uncertainties of this general kind are as
genuine as any other cases of uncertainty. They are genuine cases of not
knowing. As I may be uncertain about the real properties of some object
before me, and as I may be uncertain also about the truth of some highly
specific statement about my sensations, so I may be uncertain again when
asked to make some statement about my desires or my aims, ambitions,
hopes, attitudes, sentiments.
When I am not sure what I want to do or want to have, because my
desires are confused or inchoate or because they conflict or because they
are not clearly formed, I may try to end the uncertainty; for someone requires
the information about myself which only I can give him authoritatively.
When in this situation I stop to think, my problem is not generally,
typically, that of matching words to an independently recognized
reality; my uncertainty is not usually a semantic uncertainty, although it
could be that, or it could be that as well. In the journey-to-Italy situation,
I would naturally consider the merits of the proposed courses of action. In
determining what I want to do when I am in a state of uncertainty and
am not sure and do not know, I attend to the features of the possible
courses of action. If after careful reflection, I announce my conclusion in
the words “I now know the answer to your question: I now know what
I want,” I would normally be able to give my reasons, the considerations
that have led me to the conclusion; now I can say “I now know what I
want.” The reasons would be reasons for wanting, and they would not
naturally be counted as evidence that a certain account of my already existing
wants is true.
For these reasons some typical cases of “I do not know what I want”
may be assimilated to some typical cases of “I do not know what I shall
do,” in respect of the kind of knowledge involved. And it is not surprising
that making up one’s mind, or coming to know, what one wants to do is
often very like the formation of an intention: very like, in that the required
precautions against error are very often precautions against misguidedness
or incorrectness in the desire and the intention rather than precautions
against incorrectness in the statement of the desire or intention, taken as
something already independently formed.
There are contrasting cases when a man notices and reports his desires,
cravings, and impulses as phenomena of experience, exactly as he may
report his sensations; he notices that he wants a drink or finds that, being
hungry, he wants to eat. In such a situation a man may be uncertain about
the correct characterization (or the name) of the thing that he wants,
although there is an acceptable sense in which he does know what he wants;
for he may know that he would unhesitatingly and unerringly recognize
the particular thing, or the kind of thing, that he wants if it were produced
before him. In such cases his knowledge of what he wants, when attained,
is knowledge of a fact about himself, closely parallel with the fact that
he has such-and-such a sensation in his leg. He may often be surprised to
discover, or to notice, that at this moment he has an impulse, or desire,
which he had not expected that he would have. In such cases he would not
be said to have formed the desire, even less to have formed it as the conclusion
of a process of thought; rather he has come across it, as a fact of
his consciousness. The desire occurred and emerged in his consciousness,
independently of any calculations. The word “lust,” for example, particularly
if it has a sexual connotation, is almost, but not quite, the name of
a sensation. There is a whole spectrum of cases between the felt bodily
craving, which approximates to a sensation in respect of the kind of knowledge
that we have of it, and the reflective desire or interest that is formed
as the outcome of a process of thought. A man may, for example, be led to
recognize that he has at this time (already) a desire which he had not
known that he had and which a friend had inferred from the evidence of his
behavior. The desire was not something that he had felt; nor was it a desire
that he had formed; he had not come to know of its existence in either of
these ways. Prompted by his friend, he had originally inferred its existence
from his behavior—“This must be what I have been wanting to do.”
But here I want to concentrate on the case of the man who does not know
whether he wants to go to Italy and who has to stop to think whether he
does. If from this initial state of uncertainty he moves to a conclusion
which amounts to his now knowing what he wants, or to his now knowing
what his attitude is, his process of thought is properly characterized as deliberation.
Deliberation is a process of thought that begins with uncertainty
and that is aimed at some conclusion, accepted by the subject, of the form
“This is to be true of me.” The uncertainty from which the process of
thought started was not an uncertainty about the matching of a statement
with an independent reality. The uncertainty that leads to deliberation is
always an uncertainty about what is to be the case and not about what
is or was the case. It is an uncertainty about the future, conceived as alterable,
one way or the other, by removing the uncertainty.
Formation of a present belief, desire, intention, attitude, or sentiment is
a case of coming to know what my belief, desire, interest, attitude, or
sentiment is to be, starting from now, and not of coming to know what
it already is. Coming to know or to be sure what my attitudes and sentiments
about something are counts as a decision, insofar as the subject
has aimed, in reaching his conclusion, at some kind of appropriateness in
attitudes or sentiments taken in relation with their object. Perhaps one can
distinguish this intentional self-knowledge and intentional uncertainty from
semantic uncertainty in this way: if the subject is still uncertain and is
wondering whether he has a certain aim or ambition or not, then an observer
cannot know either that he does have the aim or ambition or that
he does not, and any knowledge that an observer possesses on this topic
must be knowledge of the future. An observer may believe that the subject
has had an aim or ambition, perhaps an unconscious one, which can
be inferred from his past behavior. But if the subject is still doubtful about
his present aims, the observer will not claim to know that the subject still
has this aim, definitely and without qualification, unless and until the subject’s
uncertainty ends. A doctor may know what I am now feeling and
that a statement about my present sensations is true, even when I am myself
still uncertain what the correct account of my sensations is. In respect
of aims and ambitions, and that which I want to achieve, an uncertainty
in giving a correct account, whether for my own benefit or for the benefit
of others, usually amounts to an uncertainty of aim. The opposite case
would be the situation in which I am uncertain of the right words to
describe my objective. The verb “believe,” being (like “try”) a strongly
intentional verb, scarcely admits of this opposite case. But there are many
situations in which I do not know which of two things I believe about
something, and therefore no one else can know what I believe until I
make up my mind and thereby come to know what I believe. An observer
can only predict that I will make up my mind in a particular way. And
he, no less than I, may infer from my behavior and from other evidence
that I must have believed so-and-so before now, but not that I must now
believe so-and-so when I myself am doubtful. So I may say of my friend—
“He cannot know what I want to do, because I do not yet know myself.”
He can only use his psychological knowledge to predict what I will want
to do.

IV

It seems that if I am still uncertain about what I want to do, I do not,
as my friend might, use the scientific knowledge of causes that I may happen
to possess to settle the question of what I now want to do by some kind
of inductive inference. There does seem to be a real difference between on
the one hand self-knowledge, in the sense of contemporary knowledge
of one’s own mind at the present time, and, on the other, knowledge of
the desires, beliefs, and sentiments of others, or knowledge of one’s own
past, of the history of one’s desires, beliefs, and sentiments, which one may
regard exactly as one regards the history of another person. Referring to
one’s own past, one may try to explain why the sequence of desires and
attitudes was what it was. This might be a standard causal type of
explanation, because the facts to be explained can be established prior to, and
independently of, the explanation of them.
Consider two different questions: (1) “What do I now want to do?”
and (2) “What will I want to do?” A man who has an adequate
psychological theory may in many cases use his knowledge of causal
connections to predict what he will want to do, if and when certain sufficient
conditions are satisfied. He knows that he will want to eat later in the day.
He knows from experience that he will want to laugh when he meets his
old friend, or that he will want to run away when he meets his enemy. But
it does not follow that he can use his knowledge of causes to discover
what his wants are in cases where he is doubtful and does not already
know what he wants to do. If he finds that he does actually feel certain
desires, he will then use his psychological knowledge to explain why his
present desires are as he finds them to be. Suppose a man believes that if
he knew the relevant covering laws, and if he knew that the initial
conditions stated were satisfied, he would know that it must be true that he will
want to do so-and-so. It must be true that he will want to eat before six
o’clock. But could he settle his uncertainty about what he now wants to do
by a similar inductive inference? If he could, then the conclusion of the
inference could be expressed in the words “Given that such-and-such
sufficient conditions have occurred, I must want so-and-so”: or, more
plausibly, “So-and-so must be what I want.” But then it seems that he
will naturally ask—“But is it?” He will look for an endorsement of this
conclusion. He will ask himself whether he finally does want what he has
inferred that he must want. An observer might have said to him, “I
know all about you and about people in your condition: so-and-so must
be what you want.” But he will wait for the subject’s confirmation; for
while the subject is still doubtful whether the so-and-so is what he wants,
his desire is still inchoate and unformed.
There is a typical use of the word “must” which will, I think, serve to
mark the distinction between the two kinds of knowledge which I am
pursuing; this is the “must” that marks an inference from evidence that
compels one to draw a certain conclusion. This is the use of “must” in
such statements as “This must be my hat” or “Your hat must be upstairs
since it is not here” or “He must have been very unhappy when he heard
the news.” It may be useful to interpolate something about this use of
“must” in claims to knowledge of matters of fact.
It has sometimes been said, mistakenly, that this use of the modal form
makes a stronger statement of fact than the corresponding plain indicative
statements—“This is your hat” and “Your hat is upstairs” and “He is very
unhappy.” If by a “stronger statement” is meant a statement which claims
a greater degree of certainty, this is surely not true. The flat assertion “This
is your hat” would commonly be taken to imply a greater degree of
certainty than the more tentative “This must be your hat,” if any implications
at all about certainty can be drawn from the wording of these statements
alone. The function of “must” in such cases as these is to imply that the
assertion about the hat is the conclusion of an inference, when the statement
might have been known to be true directly and without the aid of inference.
In a well-known short story by Saki, a woman, descending from a train
at a country station, is accosted by a stranger who says, “You must be the
new governess.” The woman replies, “Well, if I must, I must,” and the
story proceeds from there, from this willful confusion of two kinds of
necessity. Had the stranger simply said, “You are the new governess,”
she might simply have denied it. The necessity in “You must be the new
governess” is contrasted with the possibility represented by the weaker
form of modal statement “You may be the new governess”; both show
something of the weight of evidence for the conclusion, in the one case
compelling evidence, in the other not. Because “It must have been X” and
“It must be X” indicate the conclusion of an inference, they often do not
make such a strong claim, in respect of certainty, as the unmodified “It
was X” and “It is X.” The man who says, “It must have rained” makes
no stronger claim, in respect of certainty, than the man who says, “It was
raining”; the former is typically the man who has seen the wet
pavements, the latter the man who remembers the rain coming down.
So much is true of putative statements of fact about the present and the
past. But the situation in respect of statements about the future course of
events is quite different. Here the contrast between statements which are
the conclusions of an inference and statements known to be true non-
inferentially is at least far from obvious. If I am considering the future
course of events, nothing corresponds to my just seeing my hat and
recognizing it, or my having seen the rain and having remembered it. For
this reason we may be inclined to say that all statements about the future
course of events, other than statements of intention, are known to be true,
if known at all, by some kind of inference. “Fascism will not happen here,”
no less than “Fascism cannot happen here,” will be supported by some
inference; “It will happen” no less than “It must happen” will be
supported, in face of the challenge “How do you know that it will?” by an
appeal to some evidence that it will, or to some method of inference, and
not by any claim to the authority of direct knowledge, parallel to “I can
see it” or “I remember it.” For this reason there is a definite sense in which
“It must happen” is normally stronger than “It will happen.” “It must
happen” here has a greater logical force, because it says both that it will happen
and that there is compelling evidence that it will. “This horse, Ajax, must
win the race” says more than “This horse, Ajax, will win”: namely, that
there is no possibility of its losing; and this cannot be a correct thing to
say if the evidence does not exclude the possibility of its losing. The
possibility is typically excluded, and the “must” is justified, when a contrary
outcome is incompatible with some well confirmed general proposition.
Let us then try to transfer this idiom of “must,” marking the conclusion
of an inference that compels assent, into the present tense. A man may say
“I must be in love” or “I must be jealous” because he has noticed features
of his own behavior that amount to compelling evidence that he is. In
respect of these passions, revealed by their typical symptoms, he here
speaks of himself as he might speak of another or of his own past. These
are the facts that he has discovered about himself, and he can tell you how
he knows and of the evidence that has led him to this conclusion. Such
an inference is not unusual. On the other hand, only in very peculiar
circumstances would a man intelligibly say “I must be in pain,” or “I think
I must be in pain”; he scarcely ever infers that he is in pain, and there is not
normally any question of him telling you how he has discovered that he
is in pain, or of his giving you the source of his knowledge. Least of all
would he say “I must be in pain because such-and-such has happened,
which, as you know, causes pain.” There are exceptional circumstances in
which the modal idiom has a use: a man might reasonably say to a
doctor—“I suppose you are right when you infer, on the basis of your
experience and medical knowledge, that I cannot be in great pain, only in
some acute discomfort; perhaps I am exaggerating.” He infers that the
doctor’s description is likely to be, or even must be, the correct one and
that he must be misdescribing; what he is calling great pain is not what
is ordinarily called great pain. He resolves the semantic doubt (exceptional
for the unspecific description “pain,” less so for “great pain”) by an
inductive inference.
But one cannot naturally say of oneself “I must believe p,” although
one may say “I must have believed p at that time.” “I must believe p,”
representing an inference from the evidence of behavior, is a way of saying
“I must have believed p up to now,” the moment of realization. I may infer
from signs and behavioral indications what my past beliefs were—“I must
have believed him”—as I may infer the present beliefs of another. But one
cannot infer what one’s beliefs are to be, starting from now. Either one
already knows, or one has to answer a normative question, to form a belief
on the evidences of truth, as one takes them to be. I may in exceptional
circumstances infer what my beliefs will be in the future. For example, I
might say “I know from experience that when I see him I shall fall under
his influence, and that I will believe what he tells me, even in defiance of
the evidence.” The implication here is that I shall no longer know,
remember, or realize, when I am in his presence, that my beliefs are formed
under his influence and only under his influence. I will not have in mind
the correct explanation, as I now suppose, of my having the belief. I cannot
transfer this melancholy self-knowledge into the present without
incoherence—“I only believe this story because I am under his influence.”
This is not the expression of a contemporary belief but a kind of irony; for a
declared belief is necessarily endorsed by the subject as to be explained, at
least in part, by his reasons, and as not to be explained wholly by external
causes which are not taken to be evidence of truth. He who can truly claim
to believe p to be true necessarily intends his favorable attitude to p
to be alterable by evidences of error. For this is how believing that p is
true is distinguished from cognate propositional attitudes—e.g., hoping that
p is true, liking to imagine that p is true, wishing p to be true, and many
others. So it would be absurd to arrange an experiment to determine the
causes of beliefs of a certain kind, and to ask the cooperation of the
subjects: “The experimenter will apply certain stimuli and change your
environment in various ways, and you are to report the change in your beliefs
that are the effects of these external changes.” There would be a conceptual
impossibility in carrying out these instructions. Some apparent psychological
effects could be reported—e.g., a changing inclination to believe a certain
proposition. But the subjects, in the circumstance of the experiment,
would not call these noticeable psychological effects changes in belief. Nor
can a man under posthypnotic suggestion say “I only believe this because
I have been hypnotized to believe it.” My declared beliefs are not facts about
myself that I may discover after a preliminary uncertainty, as I may
discover, after a preliminary uncertainty, that I must be in the grip of some
passion because I exhibit the symptoms. For the same reason I cannot
intelligibly say “I must intend to do so-and-so,” although I can intelligibly
say “I must have intended to post the letter and then forgotten it.” On the
other hand, I could intelligibly say, using the concept of purpose, and the
present tense, “I see now: this must be my ultimate aim or purpose in
doing this although I had not known that it was.” I may infer what my
more or less unconscious aim or purpose has been up to now from signs in
my overt behavior. But now I have to endorse or repudiate the purpose;
what is to be my purpose, starting from now? Perhaps I do not know the
answer to this question; perhaps I do not know, in respect of some activity
in which I am engaged, what my aim or purpose really is. In removing
this uncertainty, I fix my aim. Similarly with the two-faced concept of
desire: While I make up my mind what I now want, there are no knowable
facts to be expressed in the words “I want to do so-and-so.” If I infer that
I must already really want so-and-so, or that I must already have such-
and-such an unconscious desire, it is still an open question whether I
dissociate myself from this desire, now brought to consciousness. The
“openness” resides in the reference to the immediate future, to “now” as
meaning “starting from now” as opposed to “up to now.” Given that I have
discovered, perhaps as the conclusion of an inference from my previous
behavior, that I do have the desire, do I want to get rid of this desire, if
I can, or do I now endorse it? Does it now persist as a conscious desire?
The fact that I have learned is a fact about the immediate past, leading
into the present. Starting from now, and from this fact, I may think again
about what is to be desired. In this uncertainty there is room for
deliberation, that is, for determining what is to be true of me.
Suppose that I am doubtful about what I want, and suppose that, under
the influence of an adequate and tested psychological theory, I think that it
must be possible to infer what I now want, just as I infer what another
man now wants on the basis of a theory—“Given so-and-so that has
happened, he must want so-and-so”; for the relevant initial conditions, and
causal factors, are known; so I conclude that I must now want so-and-so,
given that so-and-so has occurred. Is this odd and unusual? I think it is:
but is it odd and unusual for the same reason that it would be odd for
a doctor to conclude by a parallel argument that he, the doctor, must be
in pain because he has a disease which is always painful? Surely not—
the oddity has a different source. Of the doctor and his pain, one would
merely say that if he already has a pain, he must know directly that he has.
“Am I in pain?” is an unusual question for which a particular context
of semantic uncertainty must be imagined: the context, already
mentioned, in which the speaker is doubtful whether he would be
exaggerating if he described his condition as pain rather than, e.g., discomfort.
But the question “Do I want so-and-so?” is far from unusual and is
intelligible in a very ordinary context. That I should not be sure what
I want is entirely normal. One would not appeal to the principle that
if a man has a desire he must generally, and except in odd circumstances,
know directly that he has; for the very range of cases under consideration
shows that no such principle operates. It is an intrinsic feature of desires,
aims, ambitions, purposes, attitudes, that they may be uncertain, unfocused,
confused, inchoate. It was one of the strengths of Aristotle’s Ethics to have
recognized this, as against those empiricists who have taken desires as
invariably given facts of consciousness.
The oddity of saying “Applying a reliable psychological theory, I have
come to the conclusion that I must now want this” resides in the kind of
knowledge claimed, which does not match the original uncertainty,
expressed as “I do not know what I want.” Admittedly there are cases in
which a man may infer that he must really want to do so-and-so, although
he had not realized this until his attention was drawn to the evidence. But
in coming to know this, he has not removed the kind of uncertainty which
he might have expressed in the words “I do not know whether I now want
to do X or not.” If he had been asked “Do you want X?” he might have
answered “I do not know,” and then be convinced that he had been wanting
X. But he still has the immediate future, starting from now, to determine.
The man who does not know what he now wants in a matter that deeply
concerns him, and who also does not know whether he admires an action
or not, normally has a question for decision. Just because he has a question
for decision, it seems that he cannot settle his doubt by a factual inference,
which could only lead him to a conclusion about what must have already
been true before this doubt arose. In removing this present uncertainty, he
has to consider the features of X that make it desirable for him and the
features of Y that make it admirable. In confessing his uncertainty, he has
confessed that his desire for X and his admiration of Y are still mere
possibilities, to be endorsed or eliminated by him. An observer may infer,
applying a theory, that he will make up his mind in a particular way and
that he will emerge from his deliberations with a particular desire and a
particular attitude. But the subject will not now remove his uncertainty in
this way. If therefore someone were to say to him, in this situation of
uncertainty, “You must want X” and “You must admire Y,” he would
normally take this “must” as prescriptive, as an imperative of rationality or
perhaps even of morality—perhaps as meaning “You have compelling
reasons for wanting X and for admiring Y.” There would be an implied
allusion to some standard of correctness—for example, “You must, on
pain of being inconsistent” or “You must, on pain of being utterly
misguided in your desires and admirations.” The “must” would not naturally
be taken as “It must be the case that you do in fact want X and admire Y.”
Suppose I say to you: “You must want to save your friend.” I may intend
this to be taken as “It must be true that you do. No human being could
be so unfeeling.” But you would not express agreement with this proposition
in the words “All right, I must.” This would be Saki’s governess again.
You would say “Strangely enough, I don’t” or “I do.” Either you know
whether you do or do not, or it is not yet true either that you do or you
do not, just because you are still uncertain about it, after the question has
been put to you. Even if the inference to the desire had been correct,
and the desire was an unconscious desire unrecognized by the subject, it
still would not be true that you simply want now to help your friend if,
at the conscious level, you are now doubtful. However, “You must have
wanted X and admired Y” would be construed as “It must be the case that
you did.” “I must have wanted X and admired Y” would in similar fashion
be construed as meaning “It must be the case that I did.” On the other
hand “I suppose I must still want X” would often be interpreted as “I
suppose I must if I am to be consistent and rational.” I might be committed,
logically or otherwise, to wanting X by my other desires and admirations. If
the “must” is expanded as “I must, on the pain of being inconsistent or
irrational,” then it is not misleading, I think, to characterize the statement as
normative. It is normative in the same way as “I must accept the conclusion
of this argument” or “This cannot be doubted,” which does not normally
represent a psychological impossibility. But “I suppose I must still want X”
might represent the discovery and acknowledgment of a desire, particularly
a long-range, or so-called dispositional, desire, which has existed up to now;
so a man might say “I suppose I must still want to please him,”
acknowledging this fact about himself. And then the question arises—“Given that it is an
inferred fact that I have this desire, do I want to get rid of it or not?” He
has discovered and acknowledged that he has wanted to please up to now;
but does this desire, once recognized, disappear, and if it does not, but
rather lingers, does he wish that it would disappear? In any case his state
of mind and its explanation have become more complex.
I view the history of my past states of mind and attitudes from the
standpoint of an observer; that I am writing an autobiography rather than
a biography imposes no limits upon the inferences that I may make about
my desires and attitudes, as they have existed up to now, or upon the
explanations that I may give of them. The settlement of any doubts that I
may have about what these states of mind actually were does not modify
the states of mind themselves; in recognizing them for what they were,
I do not endorse them. But while I do not yet know what I want and am still
uncertain what to aim at, the adequate psychological theory cannot be
applied, just because uncertainty exists in my mind, and uncertainty of a
particular kind, the kind of uncertainty which is neither ignorance, nor a
semantic uncertainty, but rather an intentional uncertainty, which raises, at
least in part, a normative question. Imagine a man who believed that he
had an adequate psychological theory which would never leave him
uncertain about his future conscious desires, aims, ambitions, and attitudes.
He would expect to find himself, as it were, saddled with desires, aims,
ambitions, with states and attitudes, which he had all along expected to
occur in the foreseen conditions of his particular case. He would expect
all his own desires, aims, ambitions, and attitudes to occur, just as the
ordinary man now expects that, after his long fast, he will want to eat,
or that under certain conditions he will have a craving for sugar, or for
alcohol, or a longing to fall asleep. Such a psychological theorist would
expect to find himself, under foreseeable conditions, acquiring a certain
ambition or falling into an attitude of admiration, as the ordinary man
expects to find himself, under foreseeable conditions, becoming angry or
frightened or jealous, or having an impulse to run away, or becoming
depressed about something.
There are indeed desires, emotions, moods, which I know will descend
upon me under certain conditions; I can often infer and then observe their
occurrence in myself no less than in other people. There is nothing in the
characterization of such desires, emotions, moods which excludes the
possibility of my being the passive and helpless observer of them, and my
knowing of their existence in this way. I may, without contradiction, be said
to observe myself “falling into” these states or being subject to these
impulses; I can see myself in these respects as a specimen. I know that I
will want this or fear that or be disappointed by the other. But then
further uncertainty occurs at a higher level, which by itself introduces a
new complexity into my state of mind. For there is always the question of
the attitude that I am now to adopt to these presumed facts or presumed
probabilities about myself: do I want them to be otherwise or not? The
“doubling” of my references to myself is unavoidable, as long as I have the
means to infer and to notice these facts about myself.
As soon as we develop this hypothesis of the man who applies a
psychological theory in self-prediction, we see that he might be led to discard
many of the idioms that a man now uses to talk about himself, about his
aims, ambitions, interests, hopes, and attitudes. The state into which the
“psychological theorist” might expect himself to fall, a state in some
respects like the attitude of admiration, still might not be admiration. The
state into which the man falls might be like admiration in, for example, the
behavior that accompanies it. It would not be admiration unless the
predictor was ready to ask himself whether he thought that the action or person
admired was in some respects admirable; having “fallen into” the state that
resembles admiration, he would not be said to admire unless he thought
that his state was to some degree appropriate to its object. Even if he
admires only reluctantly or unwillingly, he still must think of the object
as in some way admirable. If this thought that the object is in some way
admirable is predicted by him, the prediction will be justified, at least in
part, by reference to the predicted features of the object that will explain
his thinking the object admirable. Will this thought be uncontrollable?
I labor these points because, through the empiricist tradition, we have
received an idea of knowledge, and therefore of self-knowledge, as
essentially and in all cases, knowledge of something that has an independent
existence, independent, that is, of the knowing. The tradition therefore has
difficulty in admitting that the very same process of thought may be both
a coming to be sure or to know, that something is to be the case and a
process of making it the case. The reasoning that makes me sure that
something is true of me is sometimes also the reasoning that makes it true of
me—e.g., I have to admit that I want X, or that I am trying to do X. The
empiricist tradition freely admits that a man has genuine knowledge of his
own given sensations and impressions, which are kinds of transparent objects
that we encounter in our experience. Sensations are taken within the
tradition to differ from external physical objects in that we see right through
them, and that we do not need to investigate them from different points
of view in order to know what they are; they are presented all at once and
in their entirety, and there is no distance between the observer and the
object, which needs to be judged in determining what the object really is.
There is no difference between their surface and their substance.
In recent years it has been widely admitted that the more complex
states of mind, passions, and emotions do often need to be investigated
by the subject, and the elements and aspects seen as a pattern, before a
man can be sure what they are, even though they are his own states of
mind. It is sometimes also admitted that the subject’s knowledge of their
proper characterization may be essential to the states being characterized
in a particular way, and therefore that self-knowledge makes the states of
mind different, and more complex, from what they would otherwise be.
It is recognized that there are states of mind—for example, the state of
being indignant about something or of being embarrassed by it—which are
of their nature fully intentional states, just because this kind of self-
knowledge is constitutive of them. Someone—for example, a very young
child—who does not possess any of the concepts which would enable him
to discriminate his state as one of indignation, as opposed, for example,
to anger, could not in fact be indignant, although he could be angry. This
is the feature of self-knowledge on which I am insisting because I think
that, following this path, one can come to see what is peculiar to
intentional knowledge, merely as a kind of knowledge. One difference between
being made to feel uncomfortable by something that someone has just said
and being embarrassed by his remark is that in the latter case the subject
believes or knows that the remark is part of the explanation of his feeling;
his knowledge of the explanation of his feeling distinctly modifies the state
of mind. A remark may have the effect, or consequence, that he feels
uncomfortable, without his recognizing that his feeling is caused by, or is
the consequence of, just that remark. If he does come to believe, or to
know, that his feeling is the effect of that remark, then a further question
arises for him—“Is there anything about the remark that is embarrassing?
Was it an embarrassing remark?” That is, was there ground or good reason
for embarrassment? Certainly he may recognize that he is embarrassed
by it, while admitting that this is to some degree an inappropriate, or even
a ridiculous, feeling for him to have about it. But he must believe that his
feeling has some minimum appropriateness, if it is to count as a case of
embarrassment. Of a child, it may be true that he is feeling uncomfortable
and ill at ease, and that this is the effect of a cause, namely, a remark
made which touched off certain associations in his mind. But it might be
false, and in some cases absurd, to say that he was embarrassed at or by
the remark, and, even more absurd, to say that he found the remark
embarrassing. The reflexive self-knowledge is essential in converting a case
of being made uncomfortable by it, where this is an instance of correlation
between stimulus and effect, into a case of being embarrassed by it or
ashamed of it or guilty about it, where these are typical intentional relations,
constituted and distinguished from each other by the accompanying
knowledge of, or belief about, the partial explanation of the feeling. The
knowledge is arrived at by considering the properties of the object which
constitute reasons for being embarrassed, ashamed, or guilty. I am
embarrassed, ashamed, or guilty because the object seems to me to have
certain properties, and I would mention my noticing of these properties,
or my belief that the object has these properties, in giving a partial
explanation of why I am embarrassed, ashamed, or guilty. Each of these
states has a standard target, or paradigm case. There is something that
it is normal to be ashamed of, embarrassed by, guilty about. This is part
of the sense, I think, of Aristotle’s doctrine of the mean in respect of the
passions: that there is, built into the concept of any one of the passions,
a norm of appropriateness in the object of the passion. If I have not been
sure whether I am embarrassed by something or not, then my becoming
sure is often the outcome of my being sure that there is something normally
embarrassing about it, that there is something in the object to explain the
embarrassment as not altogether inappropriate. I can usually say why I
am embarrassed by it, even if I admit that the reasons are insufficient;
giving my reasons for the attitude or feeling is in such cases more like
giving my reasons for a present or future action than it is like finding a causal
explanation of a state of mind which I have independently identified. If
the state of mind is the outcome of reflection or deliberation, I have not
identified it as existing independently of the reasons that I now take to be
the partial explanation of it. In the case imagined, when I ask myself the
question “Is this uneasy feeling that I have embarrassment about so-and-
so?” I look for an explanation of the feeling, and, as a result of this search
for an explanation, my state of mind may cease to be what it was before.
Originally it was only true that I felt uneasy; and then, with my
recognition of the source of this feeling as a normal one, it became true that
I was embarrassed by the object, e.g., a remark made.
The situation is very like that in which a man’s attention is drawn to
something that he is doing without his being distinctly aware that he is
doing it. His action acquires a new character in virtue of his knowledge
that he is doing it; new descriptions are applicable to his performances in
virtue of the fact that they are no longer unintentional in these respects.
Previously there was a cause: now there is room for a reason, and for an
explanation of his doing it, known to him, which, in virtue of this
knowledge, changes the character of the action.
The difference between “He is breaking the box,” where this is the
causal judgment that he is the cause of its breaking, and “He is breaking
the box” in a stronger sense, including the intention to break it, is very like
the difference between “He is made uncomfortable by it” and “He is
embarrassed by it.” In virtue of his knowing what he is doing, the subject
will normally be in a position to explain his action by some reason that
he has for doing it, or at least authoritatively to reject other explanations that
are suggested. He can give some explanation of why he is doing it, however
rough and vague and incomplete.
To summarize: “I do not know what I shall now do,” “I do not know
what I want,” “I do not know what I feel about it,” “I do not know what
my attitude to it is,” typically confess uncertainty, not about what is already
true, but about what is to be true of me, starting from now. This requires
a process of reflection, akin to deliberation, which is aimed at the kind of
knowledge or certainty which has as its outcome that I do now intend to do
so-and-so, want so-and-so, feel so-and-so, have such-and-such an attitude,
or sentiment. If I start from an uncertainty about myself of this kind—
about what is to be true of me in these respects—I will not infer from a
statement of initial conditions together with some general psychological
proposition what must be true of me, starting from now, as I may infer
what must have been true of me up to now. If deliberation ends with the
conclusion that I must, starting now, want so-and-so, feel so-and-so, have
such-and-such an attitude, the “must” will allude to some standard of
rationality or morality: so to a kind of correctness, which is not truth.
A present tense question put to me as a request for information—“Do
you want?” “What is your attitude to this?”—places me in an ambiguous
position, a position of facing both ways; I am confronted with the need to
report the facts, as they already are, and with the need to make a decision;
for I may dissociate myself, with a second-level desire or attitude, from
the desire or attitude which I find that I already have. The request for
information forces me into this position of active self-consciousness; and the
asking of the question may change the facts just by engendering this self-
consciousness. In response to the question, I may for the first time
recognize that I in fact have a desire or attitude that I now wish that I did
not have. The movement from knowledge of the facts about oneself, about
one’s desires and attitudes up to this moment, to a knowledge of that which
is to be true of me is a continuous movement of consciousness; it is so
pervasive that it is easily overlooked. Present tense avowals of desires and
attitudes have been found baffling in their logic, partly because they
mark this movement from the one kind of knowledge to the other. “I want
X” is open to a two-faced challenge, when the question “Are you sure
that you want X?” is asked; the first face of the challenge calls on the
autobiographer not to disregard any evidence, perhaps of behavior, that
suggests that the facts are not yet as he reports; the second challenge calls
on him not to disregard any reasons there may be for not desiring X. The
contrast between evidence and reasons here is a contrast between steps
that lead to, and are taken to justify, a conclusion which is a belief about
already wanting X, and steps that lead to, and are taken to justify, wanting
X; justification of both kinds might be needed because the belief that X
is already wanted might be a mistake, and wanting X might be a mistake.
The observer, considering the desires of the subject, would hold these
possibilities of mistake apart; he would know from which point of view
he was considering the desires. But the subject would normally not separate
the factual from the deliberative question. This is part of the asymmetry,
stressed in the controversy about knowledge of other minds, between
first-person present tense statements and other statements about states of
mind, or at least about desires and attitudes.
It is plain that this distinction between two kinds of uncertainty and of
knowledge touches the issue of freedom and determinism—but at what
point and with what consequences? I will state the minimum consequence
first. Whatever facts a man may learn about his present desires and
attitudes, he always has the higher order question of his desires and attitudes
in the face of these facts—does he want them to be otherwise? He may
sometimes find that he is powerless to change his passionate attitude toward
something, although he wishes to, and that he cannot control or suppress
a desire that he already has. He is then left in a state of conflict. He
might know the explanation of his original desire and understand its causes;
he might have been in a position to predict, accurately and with confidence,
that he would have just this desire; and he might have known that he would
be powerless to prevent himself feeling this desire, although he had wanted
not to feel it. Experimental knowledge of the explanation of his states,
the kind of knowledge that is stressed by a determinist, is knowledge of
the conditions under which he would be able to prevent something being
true of him, if he wanted to prevent it. The “if he wanted to” states a
condition which always arises no matter what scientific explanation of desires
is available. If a man had known that under stated conditions he would
want X and also that he would want not to want X, he has before him
the further question, which arises in virtue of this knowledge, of whether
he wishes this conflict not to occur. Even if he foresees accurately the
sequence of his desires and moods, and also his conflicts of desire, he
would still adopt an attitude and have wishes in respect of this sequence
and of the various elements in it. My argument is that any knowledge which
a man acquires from experiment and observation about his own present
and future states presents him with another potential uncertainty and with
the need of knowledge of another kind, and that this is a feature of
knowledge itself. This feature of knowledge could as well be illustrated by
reference to perceptual knowledge as by knowledge of mental states and
attitudes. Suppose that a man comes to know, for the first time, about some
physiological mechanism of perception; using this knowledge, he is able to
predict what his perceptions will be when a stimulus of a very specific
type is provided; the stimulus is a sufficient condition of a certain physical
state, which is in turn a sufficient condition of his perceiving something—
for example, a certain pattern of colors on the ceiling; he may still be
uncertain, when the perception occurs, whether the perception is veridical
or not, whether his inclination to believe that there is a pattern of colors
on the ceiling is to be endorsed or not. He must then find reasons to accept
or reject the belief, and must actively inquire whether the explanation of
his perception is of the kind that allows it to be a veridical one. As one
learns about the causes and conditions upon which one’s perceptions depend,
one is in a position to apply this knowledge of causes in distinguishing
veridical from deceiving perceptions; and applying the knowledge includes
actively performing tests—e.g., by changing the perspective or
angle of vision or by performing some simple interference experiments.
I can legitimately regard myself, with my sensory equipment, as an
instrument that records the presence of objects in the environment; but I am an
instrument which deliberately employs itself to find the answers to
questions which I have raised. When I find myself inclined to believe that
there is a certain pattern of colors on the ceiling, and not as the conclusion
of an inference, I may still suspend belief until I have performed tests and
found good reasons for believing. I am normally free, and not powerless, to
question any of my beliefs, whether they arise directly from perception
or not. In the setting of questioning an inclination to believe, or of resolving
uncertainty, that which causes me to believe is called a reason for believing;
in a setting in which a perceptual belief has formed without any active
inquiry, or even the shadow of an uncertainty, there must still be an
ascertainable cause of the belief, but it may not be called a reason. If I am
altogether ignorant of the psychology of perception, I will not know which
perceptual clues make me form certain beliefs, e.g., about the distance
between visible objects. I will not know how I know, and I will say that I know
directly, in cases where there has been no uncertainty resolved by inquiry
and where my belief is true. When I am uncertain and actively investigate,
my final belief has an explanation which includes a reason, the evidence
from which I have inferred. Sometimes my reason may only be the belief
that my first inclinations to believe (my first impressions of distance)
almost always are confirmed, that is, that I am a reliable instrument in this
matter; this is a reason for endorsing my first impression of distance on a
particular occasion. But the further question—“Am I sure that there is
nothing abnormal in the circumstances in which this perceptual belief was
formed?”—is never an absurd question. I do not wish to deny, but rather
to stress, the many differences between knowledge of the external world
and the knowledge that a man may have of his own states of mind, desires,
and attitudes. But the corrigibility in principle of any empirical knowledge
claim, apparent in perceptual claims, has an often neglected consequence in
claims to self-knowledge. Suppose that a man knows what he will want to do
when certain sufficient conditions obtain, which he knows will occur very
soon. When the anticipated desire occurs under the foreseen conditions,
he can still raise the question “But am I sure that I do want to do this?
Are there good reasons why I should want this?” Of course he knows
that he does now want it, that this primary desire exists, and, in the case
supposed, he knows why it exists. He is so far like the man who sees the
pattern of colors on the ceiling, and who knows that this is a possible correct
description of what he sees, but is still not sure whether there is in fact a
pattern of colors on the ceiling.
The mere raising of the reflective question has already changed his
state of mind; and this distinguishes self-knowledge of this kind from
knowledge of the external world. For there is now a sense in which he
knows what he wants to do and a sense in which he does not know what
he wants to do. Recognizing this primary desire and knowing the conditions
(perhaps a physical state) that have produced it, he may still be hesitant
and uncertain, and any conclusion of the uncertainty will have a reason
as at least part of its explanation. The uncertainty in this case of desire
amounts to a kind of conflict, and will be plainly a conflict of desire if
the outcome is that he does not want, on reflection, to do that which he
also feels a desire to do. Similarly with attitudes of mind: I may find myself
predictably feeling admiration for someone and being inclined to act
correspondingly, while on reflection I am uncertain about my attitude to
him, because I am uncertain whether there is anything admirable about
him. I have therefore a divided and uncertain attitude, which may be
continuously modified as I continue to reflect upon what my attitude is.
The transition, in self-knowledge, from knowledge of psychological
fact, and of the causes that explain my desires and attitudes, to a reflection
on these facts, which introduces causes that are reasons, depends, firstly,
on a feature of knowledge itself: that a claim to empirical knowledge is
always challengeable by the question “Am I sure that this is so?” This
challenge to reflect upon desires and states of mind, when it is contemporary
with these states, leads to the consideration of reasons. Insofar as a desire
that is to be in part explained by reasons differs from a previous desire
that was independent of these reasons, the reflection by itself brings about
a change.
This conclusion obviously leaves many of the traditional issues
associated with a thesis of determinism untouched. But it is incompatible
with a particular picture that is often associated with a thesis of
determinism: namely, the picture of advancing psychological knowledge gradually
displacing the kind of uncertainty that is expressed by the questions “Do
I want to do this?” and “What is my attitude to this?” when these questions
call for a decision. Any additions to our systematic knowledge in psychology,
and thus to our power of calculating exactly what we will or would feel
under specifiable conditions, will always leave a place open for this kind
of uncertainty.
ON OBJECTIVE INTENSIONS AND THE
LAW OF INVERSE VARIATION1
R. M. Martin

One of the most fascinating but neglected branches of modern logic is the
semantical theory of intensions. By the intension of a term is meant
variously the meaning or connotation or Sinn of the term as over and against
its designation or denotation or Bedeutung. Intensions in one form or
another have played a small, but persistent, role in the history of logic. In
modern mathematical logic they have played scarcely any role at all.
Intensions in fact are the kind of entity in which mathematicians have shown
little interest. But surely the theory of intensions must play a fundamental
role in philosophical, as contrasted with purely mathematical, logic.
A fundamental “principle” concerning intensions is the so-called law of
inverse variation of extension and intension. There was much controversy
over this law in the latter half of the nineteenth century, and its exact
logical status was in doubt. The controversy seems never to have been
settled and the “law,” if such it be, seems never to have been stated
adequately. Although the law is occasionally mentioned by more recent writers,
no one seems to have probed the logic, if there be such, underlying it. In
particular, the law seems not to have been examined by such proponents of
intensions as Carnap, Church, Fitch, or Frege. And yet it has often been
regarded as one of the fundamental laws concerning intensions. A clear and
unobjectionable statement of it is necessary if the theory of intensions is
to be given a secure modern footing.
Let us examine critically a little of the fascinating history of this law,
preparatory to attempting to give it a more exact form. We shall not present
a complete account but will merely select a few formulations of outstanding
interest. And if we start in the center, this will at least (in Bradley’s
phrase) get us into the heart of the matter.
Bradley in fact called the law “that preposterous article of orthodox
logic [that] turned the course of our reasoning into senseless miracle.”2
But Bradley never formulated the law very clearly. “Extension and
intension,” he says almost mockingly, “. . . are related and must be related
in a certain way. The less you happen to have of the one, the more
therefore you must have of the other [italics added]. This statement has often
passed itself off as both true and important. I confess that to me it has
always seemed false or frivolous.”3 Frivolous, indeed, lacking clear and
defensible definitions of ‘intension,’ ‘less,’ ‘happen to have,’ ‘must have,’
and no doubt other terms. Such definitions are not, it is to be feared,
to be found anywhere in the annals of idealist logic.
For Bradley, as for others in the Hegelian tradition, the notion of
intension is intimately bound up with the doctrine of the “concrete
universal.” There, however, ‘intension’ takes on a quite different meaning in
which extension and intension seem to coalesce. Hence naturally, on such
a meaning, a law of inverse variation does not hold.4

II

As good a preliminary statement of the law as any—and better than
most—is to be found in (the fourth edition, 1906, of) Neville Keynes’
Formal Logic.5

(L) In a series of common terms standing to one another in a relation of
subordination the extension and intension vary inversely.

Keynes does not himself accept the law in this form but uses it merely as a
basis for discussion.
We note first that this preliminary formulation of the law concerns
terms, not what the terms designate or denote. Thus straightaway we see
that it is to be a law of syntax or semantics, not of some object-language.
But the terms are common, i.e., are denotative and denote objects in
common, so that the law is clearly a semantical law, not a syntactical one
and not one of logic in the narrow sense (comprising just the theories of
truth-functions, quantifiers, and identity). Further, the terms must be
arranged in a series, although strictly this is an elaboration which is not
needed. The relation of subordination between terms is presumably a
semantical relation, analogous to class-inclusion. A term a is subordinate
to a term b if and only if every object denoted by a is also denoted by b.6
The extension of a term a is presumably the class (or virtual class) of all
objects denoted by a. So far so good, barring niceties.
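The two definitions just given can be made concrete in a small finite model. What follows is only an illustrative sketch in Python, not anything found in Keynes: the universe of objects, the three terms, and the names DENOTES, extension, and is_subordinate are all hypothetical, and classes are modeled naively as finite sets.

```python
# Hypothetical finite model: each common term denotes some objects.
DENOTES = {
    "animal": {"fido", "tom", "tweety"},
    "mammal": {"fido", "tom"},
    "dog": {"fido"},
}

def extension(term):
    """The class of all objects denoted by the term."""
    return frozenset(DENOTES[term])

def is_subordinate(a, b):
    """a is subordinate to b iff every object denoted by a is also denoted by b."""
    return extension(a) <= extension(b)

print(is_subordinate("dog", "mammal"))   # every dog in the model is a mammal
print(is_subordinate("animal", "dog"))   # some animals in the model are not dogs
```

On this reading subordination is simply inclusion between extensions, so every term counts as subordinate to itself; the "niceties" set aside in the text, such as virtual classes, are not represented.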
The extension of a term, Keynes says (p. 22), “consists of objects of
which the name can be predicated,” whereas “its intension consists of
properties which can be predicated of it,” i.e., of the extension. Intensions
are of three kinds, conventional, subjective, and objective. The conventional
intension, or connotation, of a class-name consists of “those qualities which
are essential to the class in the sense that the name implies them in its
definition. Were any of this set of qualities absent the name would not be
applicable. . . .” (How about primitive class-names, not defined?) Perhaps
we may make Keynes’ definition a little more precise by regarding the
connotation of a defined class-name ‘P’ as the class of all “qualities” Q
such that necessarily for all x, if x has P then x has Q.
The subjective intension of a term depends upon the individual user
of the term and is “less important from the logical standpoint.”7 Therefore
we need say nothing more about it here.
Finally, there “is the sum-total of qualities actually possessed in
common by all members of the class. These will include all the qualities included
under the two preceding heads, and usually many others in addition.” This
sum-total comprises the objective intension or comprehension of the class-
name. Perhaps a “sum-total” is a logical sum of classes. Or is it rather a
set? Keynes is not too explicit on this point, although, as we have seen,
the word ‘set’ creeps surreptitiously into his explanation of the meaning
of ‘connotation.’ But a logical sum is not a set, and it is most important
to know which is intended if we are to have here a clear meaning for
‘connotation’ and ‘objective intension.’
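The difference between the two readings of "sum-total" can be seen in a toy model. The sketch below is an assumption-laden illustration, not Keynes' own construction: each "quality" is identified with the class of objects having it (a simplification that the text itself would question), and the names QUALITIES, comprehension_as_set, and comprehension_as_sum are hypothetical.

```python
# Hypothetical model: each "quality" is identified with the class of objects having it.
QUALITIES = {
    "animate": frozenset({"fido", "tom", "tweety"}),
    "furry": frozenset({"fido", "tom"}),
    "barks": frozenset({"fido"}),
}

def comprehension_as_set(members):
    """Set reading: the set of (names of) qualities possessed by every member."""
    return frozenset(q for q, having in QUALITIES.items() if members <= having)

def comprehension_as_sum(members):
    """Logical-sum reading: the union of the classes of those same qualities."""
    result = frozenset()
    for q in comprehension_as_set(members):
        result |= QUALITIES[q]
    return result

mammals = frozenset({"fido", "tom"})
print(sorted(comprehension_as_set(mammals)))  # a set whose members are qualities
print(sorted(comprehension_as_sum(mammals)))  # a class whose members are individuals
```

The first result has qualities as members and the second has individuals, so "inclusion" between objective intensions means something different on each reading.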
Keynes speaks of “classes” as, presumably, the objects designated by
class-names. On the other hand, intensions are sum-totals or sets of
“qualities.” His ontology is thus at best rather mixed.
Keynes distinguishes three laws of variation which “must together be
substituted for the law of inverse variation between extension and intension
in its usual form (L) if full precision of statement is desired.” The first of
these laws is (p. 37):

(K) If the connotation of a term is arbitrarily enlarged or restricted, the
denotation in an assigned universe of discourse will either remain
unaltered or will change in the opposite direction.

The statement of (K) is surely clearer than that of (L). In particular,
note that ‘vary inversely’ has been dropped. What is meant anyhow by
‘inverse variation’ in connection with the extension and intension of terms?
We have no quantitative measure of terms, so that inverse variation cannot
mean here what it means in physics, for example, in the statement of
Boyle’s law.
Let ‘M’ be a class-term. Let con(‘M’) be the conventional intension or
connotation of ‘M’. What does it mean now to say that the connotation of
‘M’ is “arbitrarily enlarged or restricted”? Presumably by this—in accord
with (L)—we are invited to consider the connotation of another class P
in which M is properly included or which properly includes M. Let
con(‘P’) be the connotation of ‘P’. In saying that the connotation of ‘M’ is
enlarged, we are intended perhaps to consider that con(‘M’) is properly
included in or subordinate to con(‘P’). Of course we can say this only
where con(‘M’) and con(‘P’) are themselves sets (or classes of some sort,
perhaps classes of classes or of “qualities”), so that by ‘inclusion’ we
mean set-inclusion of the type appropriate to sets of “qualities.” If, on the
other hand, conventional intensions (or connotations) are to be construed
as logical sums of classes, inclusion must be of the type appropriate to
classes of individuals. We must distinguish then two statements
corresponding to (K), depending upon whether con(‘P’) and con(‘M’) are taken as
logical sums or as classes. Both of these statements, however, will have
the form

(K') If con(‘P’) is properly included in con(‘M’), then M is properly included
in or identical with P.

(K'), on the set-meaning, although an improvement on (K), is still far
from satisfactory. In particular there is implicit reference to “attributes”
or “characteristics.” It is doubtful that we need recognize a separate realm
of attributes as distinguished from classes, for they are a dubious entity at
best. A clear condition under which two attributes may be said to be
identical is lacking, as Quine and others have repeatedly pointed out.8 Also the
notion of connotation is not too clear. What are “those qualities which are
essential to the class in the sense that the name implies them in its
definition”? What is meant by ‘essential’ here? By ‘implies’? Without
acceptable answers to these questions, we cannot claim to have put forward
a clear account of connotation at all. It has been said that “clarity is not
enough,” but still it is a good deal and surely a prerequisite for any adequate
logical theory of connotation.
It should be noted, incidentally, that Keynes is usually careful to
distinguish between the use and mention of expressions, a care not shown by
all writers on the subject. But he vacillates between speaking of class-names
and of “properties” or “qualities” or “attributes” or “characteristics,” as
we have noted. Many of his statements concerning class-names are
acceptable on the basis of modern syntax and semantics. When he speaks of
“attributes” or “characteristics,” in the wake no doubt of Aristotelian
essentialism, he is less clear and not always defensible.
Keynes goes on to distinguish two other laws of variation, which,
however, concern more special notions and are merely supplementary to (K).

III

Two further, but very different, formulations of the “law” are to be
found in Cohen and Nagel’s influential An Introduction to Logic and
Scientific Method.9 First:
(A) When a series of terms is arranged in order of subordination, the
extension and intension vary inversely.

And secondly:
(B) If a series of terms is arranged in order of increasing intension, the
denotation of the terms will either remain the same or diminish.

Cohen and Nagel criticize (A) but seem to condone (B) as a satisfactory
law.
Although these two formulations appear similar to those of Neville
Keynes, actually they are very different. In the first place, by a term
Cohen and Nagel do not understand an expression, but rather what the
expression stands for. Propositions, according to them and following
Aristotle,
either assert or deny something of something else. That about which the assertion
is made is called the subject, and that which is asserted about the subject is
called the predicate. The subject and the predicate [i.e., presumably the things,
not the expressions] are called the terms of the proposition. ... A term may be
viewed in two ways either as a class of objects (which may have only one
member), or as a set of attributes or characteristics which determine the
objects. The first phase or aspect is called the denotation or extension of the term,
while the second is called the connotation or intension. Thus the extension of
the term “philosopher” is “Socrates,” “Plato,” “Thales,” and the like; its
intension is “lover of wisdom,” “intelligent,” and so on.

It is interesting that Cohen and Nagel do regard the intension as a set
of attributes or characteristics. They are perhaps the first to recognize
explicitly that intensions are sets in some fashion, and this recognition
constitutes an advance in the subject.
By the conventional intension or connotation of a term, Cohen and Nagel
mean, more precisely, “the set of attributes which are essential to it.”
Thus the intension of the attribute P is presumably the set of all attributes
A essential to P. And by “essential,” they say, “we mean the necessary and
sufficient condition [in the singular, notice] for regarding any object as an
element of the term.” This definition is not too clear. Perhaps, according
to it, we may say that attribute Q is essential to attribute P if and only if
necessarily every object which has P has Q and conversely. The intension
of P then becomes the set of all attributes A such that necessarily every
object which has P has A and conversely. But this is not the set intended,
for presumably it would be a set with only one member, namely, P itself.
Perhaps the intension of P here is to be only (i) the set of all attributes A
such that necessarily every object which has P has A. Or perhaps (ii) the
finite set of attributes {Q_1, . . . , Q_n} such that necessarily every x which
has P has Q_1, has Q_2, . . . , has Q_n. But all this is a little murky.
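The contrast between the biconditional reading of ‘essential’ and reading (i) can be checked in a small extensional model. This is a sketch under strong assumptions that are not in Cohen and Nagel: ‘necessarily’ is dropped altogether, every class of objects in a three-element universe counts as an attribute, and coextensive attributes are treated as identical. The names UNIVERSE, ATTRIBUTES, int_biconditional, and int_conditional are hypothetical.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

UNIVERSE = {"a", "b", "c"}          # hypothetical objects
ATTRIBUTES = powerset(UNIVERSE)     # assumption: one attribute per class of objects

def int_biconditional(p):
    """Attributes A such that every object with P has A, and conversely."""
    return {a for a in ATTRIBUTES if p <= a and a <= p}

def int_conditional(p):
    """Reading (i): attributes A such that every object with P has A."""
    return {a for a in ATTRIBUTES if p <= a}

P = frozenset({"a"})
print(int_biconditional(P) == {P})   # only P itself survives the biconditional
print(len(int_conditional(P)))       # reading (i) admits every attribute that includes P
```

In this model the biconditional reading collapses to the one-member set containing P, as the text suggests, while reading (i) yields a set with several members.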
For the subordination of intensions Cohen and Nagel have a clear
meaning, for intensions are certain kinds of sets and hence subordination is
merely class- or set-inclusion. Let for the moment ‘int(P)’ designate the
intension of P in whatever sense they intend. If we disregard for the moment
the unnecessary complication of bringing in the notion of a series, their
formulation (A) seems to be:

(A') If P is subordinate to Q, then int(Q) is subordinate to int(P).

By ‘subordinate’ here is meant for the moment properly subordinate or
is properly included in. (B), on the other hand, would read:

(B') If int(Q) is subordinate to int(P), then P is subordinate to or identical
with Q.

We do not claim that (A') and (B') are precisely what Cohen and Nagel
intend by (A) and (B), but only that they are reasonably close
approximations to what they perhaps intend.
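Under a purely extensional simplification, (A') and (B') can be checked by brute force over a small universe. The sketch below assumes, beyond anything in the text, that every class of objects counts as an attribute, that int(P) is the set of all attributes whose class includes P (reading (i) with ‘necessarily’ dropped), and that ‘subordinate’ is proper inclusion. The names intension, holds_a_prime, and holds_b_prime are hypothetical.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

UNIVERSE = {"a", "b", "c"}          # hypothetical objects
ATTRIBUTES = powerset(UNIVERSE)     # assumption: one attribute per class of objects

def intension(p):
    """int(P), extensionally: all attributes whose class includes P."""
    return frozenset(a for a in ATTRIBUTES if p <= a)

def holds_a_prime():
    """(A'): if P is properly included in Q, then int(Q) is properly included in int(P)."""
    return all(intension(q) < intension(p)
               for p in ATTRIBUTES for q in ATTRIBUTES if p < q)

def holds_b_prime():
    """(B'): if int(Q) is properly included in int(P), then P is included in or identical with Q."""
    return all(p <= q
               for p in ATTRIBUTES for q in ATTRIBUTES
               if intension(q) < intension(p))

print(holds_a_prime(), holds_b_prime())
```

That both checks succeed here depends on the assumption that every class is itself an attribute. The model says nothing about the modal and ontological difficulties raised in the text, which concern precisely what this simplification leaves out.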
It should be noted that Cohen and Nagel’s formulations are given as
principles within an object-language. For Keynes, however, as we have
observed, these principles are meta-linguistic, more precisely, semantical,
principles. It is not altogether clear what is thought to be gained by
their statement within an object-language. Usually by a term one
understands an expression of such and such a kind. Keynes is explicit on this
point. According to him, “. . . it seems better to start from the names. . . .
Neglect to consider names in . . . connexion [with extension and intension]
has been responsible for much confusion.” Also Carnap and Frege have
emphasized the importance of regarding intensions as intensions of
expressions. In fact, it is now almost universally recognized that the theory of
extension, denotation, connotation, intension, and the like, belongs to
semantics, and hence is to be formulated within a semantical
meta-language.
There is also the difficulty concerning ‘necessary’ in Cohen and Nagel’s
formulations. Is this to be construed in the sense of some modal operator?
If so, we must then face squarely the need for mixing modal operators and
quantifiers. We do not wish to contend that this cannot be done satisfactorily,
but only that there are grave difficulties here which must not be glossed
over. Quine and others have pointed these out repeatedly. Surely we
cannot expect a satisfactory formulation in the manner of Cohen and Nagel
of the law or laws of inverse variation until these difficulties are overcome.
Also closely connected with these is the dubious ontology of attributes,
etc., which we have already met in the formulations of Keynes.10
The reader may object that we are being excessively laborious as to
logical detail. But, to borrow a phrase which Cohen and Nagel themselves
use in another context, “it is necessary to . . . [be] so if we wish to avoid
elementary confusions.” In trying to avoid such we must occasionally be
allowed to refer to more advanced matters in modern logic, syntax, and
semantics, as well as to pay strict attention to what might appear as
minutiae. To refuse to allow this is often to refuse to let a subject grow or
develop beyond its incipient (or textbook) stages. As logic itself, including
now the theory of intensions, has developed enormously within recent
years, it must carry its history along with it, subjecting it to fresh critical
scrutiny in the light of present knowledge.

IV

Let us reflect now upon what is needed if we are to gain a more adequate
formulation of the law or laws of inverse variation.
In the first place, we must have a well-articulated semantics including
a syntax. Nothing less than this surely is now acceptable. Because the
law or laws are so intimately connected with denotation or designation, the
most natural logical underpinning is no doubt to be found in denotational
or designational semantics.
Next we must have a clear and acceptable notion of what intensions are.
We must have a clear ontic description of them. One searches far and
wide in the literature for this. In the author’s Intension and Decision, a
theory of many kinds of intension was suggested. Certain defects, however,
marred the theory there, but these we shall try to correct. Intensions should
emerge in a natural way, so we should contend, as certain kinds of
entities already available in the underlying denotational semantics. They
are not, in other words, to be regarded as sui generis. The notion of
analytic truth should play a fundamental role, supplanting the somewhat
vague and unsatisfactory traditional use of ‘necessary’ or ‘essential.’ Also
intensions must be dissected into parts (or members) in some fashion,
and are not to be regarded as indivisible wholes. As regards this, there
seem to be two historical lines of research. On the one hand, there is the
German tradition, stemming no doubt from Kant, through Frege to Carnap.
And, on the other, there is the English tradition to which Stuart Mill and
Neville Keynes have contributed notably. The key difference seems to be
that in the German tradition intensions are regarded as sui generis and as
indivisible wholes, whereas in the other they are dissected into components.
On this crucial matter we side with the English. We shall try to be as clear
as possible as to precisely what these components are. Heretofore this has
been left rather vague, as we have seen.
Let us turn now to recent semantical theory, in order to pave the way
for a more sophisticated and (hopefully) adequate formulation of the law
or laws of inverse variation.
To designate is to be a proper name of, whether of an individual, a
class, or a relation. To denote is to be a common name of an individual
and of an individual only. Semantical rules concerning designation and
denotation reflect both actual usage and convention, but are no doubt to be
laid down in part by meta-linguistic fiat. There is a third notion, also of
great semantical interest, that of L-designation. It is perhaps a more
“natural” notion than either designation or denotation, simpler, and more
closely reflecting in some sense actual usage. L-designation has occasionally
been mentioned by Carnap, although very rarely used even by him.11 Also
it is definable, with certain desirable restrictions, in terms of denotation.
Hence an intensional semantics based on it shares the advantages enjoyed
by the semantical meta-languages based upon denotation.
We shall also need to introduce a notion of virtual-class L-designation.
Virtual classes, it will be recalled, are classes manquées, very like real
classes, but lacking the crucial feature of being values for variables in the
formalism at hand.12 The theory that emerges, with virtual classes used
in place of real classes and L-designation in place of designation, represents
an important improvement over the theory put forward in Intension and
Decision.
Classes are treacherous entities. They lead us astray in mathematics, and
hence there is little reason to regard them as suitable tools in philosophy.
In their favor is the circumstance that they are extensional entities and
hence relatively clear. If we reject them, it would not do to reinstate
intensional entities in their place, entities such as properties or class-concepts,
relation-concepts, individual-concepts, or propositions.13 If we are to
reject classes, we must do so toto caelo and reject intensional entities
as well. Rejection here of course consists only of refusal to admit as values
for variables. If we can succeed in defining a suitable notation for such
entities, or somehow succeed in gaining the effect of such, upon an acceptable
foundation, this is all to the good. In fact, this is just what we hope
to do for the various kinds of intensions.
The philosophic importance attaching to the rejection of intensions as
values for a special kind of variable (intensions sui generis, so to call
them) is multifold. First, there is the distrust of pseudoentities. At best
intensional entities are suspect and should not be multiplied beyond necessity.
Then there is the complexity of the laws governing them. These are
notoriously sticky and more complex than ordinary laws governing extensions.
Also there is the further complexity of the whole meta-language
in which intensions sui generis can be accommodated. Extensional meta-languages
are always considerably simpler, both in structure and in what
is assumed. There is also the relevant thesis of the unity of science. Why
should only extensional entities be required in certain parts of science and
then all of a sudden abandoned when we turn to others? It is as though we
were to condone here a fundamental severance similar to the alleged one
between the Natur- and Geisteswissenschaften. Also there is the very
considerable historical confusion that surrounds the subject of intensions.
When properly viewed and analyzed, so we contend, the assumption of
intensions sui generis is seen not only not to have been needed but to have
been positively harmful. The role of intensions, to a large extent at least,
can be played instead by certain kinds of virtual constructs, as we hope
to show.

V

As is frequently done, let us presuppose some suitable first-order language-system
L as object-language, rich or expressive enough to be of some
philosophic interest. Our task is to construct a semantics for L in as simple
a way as possible, in which a theory of objective intensions for L may be
accommodated. There are no doubt many ways of formulating such a
semantics for L. Let us choose one here, as elsewhere, based on denotation.
The primitive is ‘Den’, it will be recalled, significant in contexts of the form
‘a Den x’, read ‘the expression a denotes the individual x’. We recall also
that, in the underlying syntax, every expression of L is given a structural-descriptive
name, ‘lp’ for ‘(’, ‘tilde’ for ‘~’, and so on. Abundant use will
be made of virtual classes, which, as we have noted, differ from real ones
primarily in not being values for variables. Although the terminology and
notation of Intension and Decision are used here for the most part, familiarity
with them, or it, is not presupposed.
Let ‘(—x—)’ be some sentential function of the object-language L
containing ‘x’ as its only free variable. The virtual class of all individuals x
such that (—x—) may then be expressed by ‘x∋(—x—)’, the ‘∋’ being
read ‘such that.’ Any expression of the form ‘x∋(—x—)’ will be, then, a
one-place abstract. Let ‘PredConOne a’ express that a is a primitive one-place
predicate constant of L or a one-place abstract containing no free
variables. We shall concern ourselves primarily with one-place predicate
constants, but we can pass on to two-place, three-place, etc., predicate
constants easily as desired.
Governing the primitive ‘Den’ for multiple denotation, we have the following
two rules. First, that for all x, m Den x if and only if (—x—),
where (i) ‘m’ is taken as the structural description of the abstract
‘x∋(—x—)’, ‘(—x—)’ being any sentential function of L containing ‘x’ as
its only free variable, or (ii) ‘m’ is taken as the structural description of a
primitive predicate constant and ‘(—x—)’ consists of that predicate constant
concatenated with ‘x’. And, second, that for all a and for all x, if a
Den x then PredConOne a. These two rules fix the properties of Den.
According to the first, just certain PredConOne’s denote certain objects,
and according to the second, no expressions other than PredConOne’s can
denote.14
In terms of ‘Den’ we may define now virtual-class designation. Thus an
expression a may be said to designate a virtual class x∋(—x—) provided a
is a PredConOne and for all x, a Den x if and only if (—x—). Thus, in
symbols,

(D1) ‘a Des x∋(—x—)’ abbreviates ‘(PredConOne a • (x)(a Den x ≡ (—x—)))’, if (etc.)

Let L contain ‘M’ (for men) and ‘R’ (for rational beings) as
PredConOne’s. According to this definition-schema we may say that ‘M’
designates the virtual class of individuals who are men, ‘x∋(Mx • Rx)’
designates the virtual class of individuals who are both men and rational,
and so on. Also a here designates the virtual class x∋(—x—), no matter
what notation we use to refer to that class. Thus if x∋(—x—) =
x∋(. . x . .), then also a Des x∋(. . x . .). After all, a rose by any other
name. . . .
We may go on to virtual-class L-designation as follows. An expression a
is said to L-designate a virtual class x∋(—x—) if and only if a designates
x∋(—x—) and a = m, where in place of ‘m’ we put in the structural-descriptive
name of the abstract ‘x∋(—x—)’. As a matter of fact, however,
the clause concerning designation here may be dropped, for it follows from
the other clause by one of the Rules of Denotation. Hence we are left
with the especially simple definition,

(D2) ‘a LDes x∋(—x—)’ for ‘a = m’, where (etc.)

This definition, or rather definition-schema, defines ‘LDes’ only in contexts
in which ‘a’ is the only free syntactical or expressional variable. The
same is true, note, of the definition-schema for ‘Des’. We have no variables
for virtual classes, only abstracts, and hence we do not define ‘Des’ and
‘LDes’ in contexts wherein the second argument is a variable. Nor would
it be desirable to do such where we are dealing with only virtual classes.
When we say that a LDes x∋(—x—), the notation we use to refer to
x∋(—x—) is all-important. Here a rose by any other name differs remarkably,
because the names themselves differ. L-designation is a matter primarily
of the names, as it were, and only secondarily of what the names
stand for. Thus ‘M’ L-designates x∋Mx, but not x∋(Mx • Rx), although the
two virtual classes are the same. We see then that L-designation, strictly
speaking, is a notion of syntax. Hence the virtual classes referred to in the
definiendum of (D2) may be called syntactic or nominal virtual classes.
But to say this is misleading. It suggests that we can subdivide the virtual
classes into those which are nominal and those which are not, and this is not
correct. To be nominal is rather a property of the occurrence of a virtual-class
expression. Although we gain the effect of speaking of a virtual class
in the definiendum of (D2), this is merely an illusion, for strictly we are
speaking only of a certain expression which designates, more properly,
L-designates, it.
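
The contrast between designation (D1) and L-designation (D2) may be illustrated by a small sketch. Everything in it is an assumption made for illustration only: the three-member domain, the representation of abstracts as tuples of predicate letters read conjunctively, and the function names `denotes`, `des`, and `ldes`. It is a toy model and not the meta-language itself.

```python
# Toy model: the domain, the data, and all function names are illustrative assumptions.
DOMAIN = {"socrates", "plato", "fido"}

# Extensions of the primitive one-place predicate constants 'M' and 'R'.
PRIMITIVES = {
    "M": {"socrates", "plato"},  # men
    "R": {"socrates", "plato"},  # rational beings
}

def denotes(a, x):
    """'a Den x': a is a tuple of predicate letters, read as their conjunction."""
    return all(x in PRIMITIVES[p] for p in a)

def extension(a):
    """The virtual class picked out by a, computed over the toy domain."""
    return {x for x in DOMAIN if denotes(a, x)}

def des(a, abstract):
    """(D1), roughly: a denotes exactly the individuals satisfying the abstract."""
    return extension(a) == extension(abstract)

def ldes(a, abstract):
    """(D2), roughly: a is the very expression named, a matter of shape alone."""
    return a == abstract

M = ("M",)            # stands in for the abstract for men
M_AND_R = ("M", "R")  # stands in for the abstract for rational men

print(des(M, M_AND_R))   # compares the two virtual classes
print(ldes(M, M_AND_R))  # compares the two expressions
```

In this model the two abstracts pick out the same individuals, so `des` cannot tell them apart, whereas `ldes` looks only at the expressions and so distinguishes them.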
Although a virtual-class expression occurs in the definiendum of (D2),
no such expression occurs in the definiens. The definiens merely stipulates
that a be of such and such a shape. For a to L-designate is then merely
for it to be of such and such a shape. But, as we have noted, any expression
of such and such a shape does in fact designate such and such a virtual
class, in view of the Rules of Denotation. Hence one and the same virtual
class is, in a roundabout way, implicitly involved in the definiens as well
as in the definiendum.
We should note that the relation of L-designation here differs remarkably
from that of Carnap. That of Carnap is of course a semantical relation,
not one of syntax, and further, within the theory of meaning or intension
sui generis rather than within the theory of extension or reference. According
to him, and in essentially his own words, an expression a L-designates
an entity in L if and only if it can be shown that a designates that entity
merely by using the semantical rules of L without any reference to facts.15
For Carnap, the entity L-designated may be a real class, whereas here
it can be a virtual one only. On the other hand, suppose we know that
a = m, where in place of ‘m’ we put in the structural description of
‘x∋(—x—)’. It then follows, merely by the Rules of Denotation, that a
designates x∋(—x—), as we have already remarked. This circumstance
seems sufficient to justify referring to the relation here as one of L-designation.
Nominal virtual classes might also be referred to as thus-designated
virtual classes. A natural reading of the definiendum of (D2) is ‘the
expression a stands for the virtual class X3(—x—) as thus designated’. We
may then, if we wish, speak of thus-designated virtual classes, although
the terminology is perhaps a bit awkward. Nominal virtual classes might
also be called virtual classes in intension. They are not strictly intensions
but are merely “taken in intension,” as the traditional phrase has it. We
shall see in a moment that they are members of intensions, entering into
their inner constitution, as it were, but are not to be confused with intensions
themselves. Nominal virtual classes may also be thought of as virtual-class
concepts. Also we may speak of occurrences of expressions for nominal
virtual classes as being opaque or indirect rather than direct. At any event,
there is surely enough kinship here to justify these various terminologies.

VI

Let us turn now to the various kinds of objective intensions, essentially
as in Intension and Decision, mutatis mutandis.
An expression a is said to be analytically included in an expression b, in
symbols ‘a AnlytcInc b’, provided the sentence which consists of ‘(x)(’ followed
by (or concatenated with) a followed by ‘x ⊃’ followed by b followed
by ‘x)’, where a and b are one-place predicate constants, is an analytic
sentence of L. Note that here we are merely spelling out the structural
description of the given sentence. Suppose a is the predicate ‘P’ and b ‘Q’.
Then a is analytically included in b if and only if ‘(x)(Px ⊃ Qx)’ is an
analytic sentence of L.
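
The spelling out of the inclusion sentence by concatenation can be sketched as follows. The sketch assumes that expressions of L are modelled as plain Python strings, and the set `ANALYTIC` is a hypothetical stand-in for the analytic sentences of L; neither is part of the theory as stated.

```python
# Assumption: expressions of L are plain strings, and ANALYTIC is a hypothetical
# stand-in for the set of analytic sentences of L.
ANALYTIC = {"(x)(Px ⊃ Px)", "(x)(Px ⊃ Qx)"}

def inclusion_sentence(a, b):
    """Concatenate '(x)(', a, 'x ⊃ ', b, and 'x)' into a single sentence."""
    return "(x)(" + a + "x ⊃ " + b + "x)"

def anlytc_inc(a, b):
    """a AnlytcInc b: the inclusion sentence built from a and b is analytic."""
    return inclusion_sentence(a, b) in ANALYTIC

print(inclusion_sentence("P", "Q"))
print(anlytc_inc("P", "Q"), anlytc_inc("Q", "P"))
```

The relation is thus fixed entirely by which sentences count as analytic; the function does nothing more than assemble the sentence and look it up.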
Given a one-place predicate constant a, the members of its objective
analytic intension are to be the nominal virtual classes L-designated by one-place
predicate constants b such that a is analytically included in b. We
cannot introduce the notion in just this fashion, but instead say that a
given nominal virtual class x∋(—x—) is a member of the objective
analytic intension of an expression a, if and only if a is a one-place predicate
constant of L and there exists a one-place predicate constant b which L-designates
x∋(—x—) and such that a is analytically included in b. In
symbols,

(D3) ‘x∋(—x—) ∈ ObjAnlytcInt(a)’ abbreviates
‘(PredConOne a • (Eb)(b LDes x∋(—x—) • a AnlytcInc b))’.

And of course the whole definiendum must be regarded as an indivisible
unit. In other words, (D3) provides a contextual definition of the entire
phrase ‘x∋(—x—) is a member of the objective analytic intension of a’,
and not of any part of this phrase in isolation. Thus ‘member of’ is used
here only by proxy, but there is enough similarity with class-membership, or
rather virtual-class-membership, to justify the use. Likewise ‘the analytic
intension of’ is defined only as embedded in the given kind of context.
Further contexts can be introduced as we go on.
Let us now turn to the classic example, homo animal rationalis. In accord
with this, let ‘M’ be regarded as short for ‘x∋(Rx • Ax)’, where ‘R’ (for
rationals) and ‘A’ (for animals) are likewise PredConOne’s. Let us assume
that ‘R’ and ‘A’ are primitive. As members of the analytic intension of
‘M’ we have then the nominal virtual classes of rationals, of animals, and
of men, together with any other nominal virtual classes in the L-designators
for which ‘M’ is analytically included. Let ‘F’ (featherless) and ‘B’ (bipeds)
be further primitive PredConOne’s. Although

‘(x)(Mx ⊃ (Fx • Bx))’,

which states roughly that every man is a featherless biped, is true in L, it
is not analytically so. Hence the nominal virtual class of featherless bipeds,
as well as that of bipeds and that of featherless objects, cannot be members
of the analytic intension of ‘M’. This is of course as we wish it and as it
should be.
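
The example of ‘M’ can be worked through in a toy model. The model rests on assumptions that are not in the text: each PredConOne is identified with a set of primitive conjuncts, analytic inclusion is taken to hold when the conjuncts of the second predicate are among those of the first, and `FACTS` is invented data about a four-member domain. Members of an intension are represented by their L-designators.

```python
# Assumptions: predicates are sets of primitive conjuncts; FACTS is invented data.
DEFS = {
    "R": frozenset({"R"}),        # rationals
    "A": frozenset({"A"}),        # animals
    "F": frozenset({"F"}),        # featherless objects
    "B": frozenset({"B"}),        # bipeds
    "M": frozenset({"R", "A"}),   # men, defined as rational animals
    "FB": frozenset({"F", "B"}),  # featherless bipeds
}
FACTS = {
    "socrates": {"R", "A", "F", "B"},
    "plato": {"R", "A", "F", "B"},
    "fido": {"A"},
    "plucked_fowl": {"A", "F", "B"},
}

def anlytc_inc(a, b):
    """a is analytically included in b: the conjuncts of b are among those of a."""
    return DEFS[b] <= DEFS[a]

def ver_inc(a, b):
    """a is veridically included in b: whatever in fact satisfies a satisfies b."""
    return all(DEFS[b] <= props for props in FACTS.values() if DEFS[a] <= props)

def obj_anlytc_int(a):
    """L-designators of the members of the objective analytic intension of a."""
    return {b for b in DEFS if anlytc_inc(a, b)}

def obj_ver_int(a):
    """L-designators of the members of the objective veridical intension of a."""
    return {b for b in DEFS if ver_inc(a, b)}

print(sorted(obj_anlytc_int("M")))
print(sorted(obj_ver_int("M")))
```

In this model the featherless bipeds enter the veridical intension of ‘M’, since every man in `FACTS` happens to be one, but they stay out of the analytic intension, since nothing in the definition of ‘M’ requires it.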
Let us say that a is veridically included in b, in symbols ‘a VerInc b’, provided
a and b are both PredConOne’s and the sentence consisting of ‘(x)(’
followed by a followed by ‘x ⊃’ followed by b followed by ‘x)’ is true.
Veridical inclusion is like analytic inclusion, with ‘Tr’ for truth replacing
‘Anlytc’ in the appropriate definientia. Similarly we go on to theoremic
and synthetic (or factual) inclusion. ‘a ThmInc b’ expresses that a is
theoremically included in b, and ‘a SynthcInc b’ that a is synthetically so.
These relations prepare the way for other kinds of intensions.
We may introduce now objective synthetic, veridical, and theoremic intensions
by replacing ‘AnlytcInc’ in the definiens of (D3) by ‘SynthcInc’,
‘VerInc’, and ‘ThmInc’ respectively. It is natural to think that these, together
with analytic intensions, would constitute the four fundamental
types, one corresponding to the semantical notion of being analytic, one to
that of being synthetic, one to that of being true, and one to that of being
a theorem. We let ‘ObjSynthcInt’, ‘ObjVerInt’, and ‘ObjThmInt’ respectively
symbolize these notions.
It should be noted that analytic intensions do intimately depend of
course upon the analytic truths of L. These in turn are merely the logical
truths of L together with the results of abbreviating them by using the
definitions. Nothing more elaborate is involved, so that the notion of
being analytic here is as well-founded as that of being logically true.

VII

It is to be observed that the theory here does not allow that one and the
same nominal virtual class can be a member of both the ObjAnlytcInt(a)
and of the ObjSynthcInt (a). The way in which the member is specified or
referred to makes all the difference. (Hence our concern with nominal
virtual classes.) We may see this as follows.
Let

(i) x∋(—x—) ∈ ObjSynthcInt(a)

and let a L-designate some nominal virtual class N. Then N is a subclass
of x∋(—x—), and this synthetically. More precisely, we should say here
that there is a b such that a SynthcInc b where b LDes x∋(—x—). But
also N is a subclass of the logical sum of N and x∋(—x—), i.e., of
x∋(Nx v —x—), and this analytically. More precisely, there is a c such
that a AnlytcInc c where c LDes x∋(Nx v —x—). Hence we have that

(ii) x∋(Nx v —x—) ∈ ObjAnlytcInt(a),

according to (D3). Although x∋(—x—) and x∋(Nx v —x—) are in fact
the same virtual class, N being a subclass of x∋(—x—), they are not one
and the same nominal virtual class. Hence we are not allowed to interchange
salva veritate the abstracts for them in either (i) or (ii). The way
in which the member of an objective intension is referred to is crucial.
Strictly the members of an objective intension are merely nominal virtual
classes, not virtual classes simpliciter, as we have noted.
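
The argument just given can be checked mechanically in a small sketch. Its assumptions are these: abstracts are represented as nested tuples built from atoms with ‘or’ and ‘and’, the atom `P` stands in for the schematic ‘(—x—)’, analytic inclusion is tested by a truth-table check (adequate here only because every formula is a truth-function of monadic atoms in the one variable x), and `WORLD` is invented data in which every N happens to be a P.

```python
from itertools import product

# Assumptions: formulas are nested tuples; WORLD is invented factual data.
N = ("atom", "N")
P = ("atom", "P")      # stands in for the schematic '(—x—)'
N_OR_P = ("or", N, P)  # the logical sum of N and P

WORLD = {"n1": {"N", "P"}, "p1": {"P"}, "o1": set()}

def atoms(f):
    """The predicate letters occurring in the formula f."""
    return {f[1]} if f[0] == "atom" else atoms(f[1]) | atoms(f[2])

def holds(f, props):
    """Whether f is satisfied by an individual having exactly the given properties."""
    if f[0] == "atom":
        return f[1] in props
    if f[0] == "or":
        return holds(f[1], props) or holds(f[2], props)
    if f[0] == "and":
        return holds(f[1], props) and holds(f[2], props)
    raise ValueError(f[0])

def anlytc_inc(a, b):
    """a is included in b on every assignment of truth values to the atoms."""
    letters = sorted(atoms(a) | atoms(b))
    for bits in product([False, True], repeat=len(letters)):
        props = {letter for letter, bit in zip(letters, bits) if bit}
        if holds(a, props) and not holds(b, props):
            return False
    return True

def extension(f):
    """The virtual class picked out by f in WORLD."""
    return {x for x, props in WORLD.items() if holds(f, props)}

def ver_inc(a, b):
    """a is in fact included in b."""
    return extension(a) <= extension(b)

print(extension(P) == extension(N_OR_P))           # one virtual class
print(P == N_OR_P)                                 # two expressions
print(anlytc_inc(N, N_OR_P), anlytc_inc(N, P), ver_inc(N, P))
```

The inclusion of N in the logical sum holds on every assignment, whereas its inclusion in P holds only in virtue of the facts recorded in `WORLD`, although the two abstracts determine one and the same class there.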
If we regard objective intensions as real classes of real classes, in the
manner of Intension and Decision, the situation just described raises difficulty.
That every member of an analytic intension, in the sense of Intension
and Decision, is also a member of the corresponding veridical one is immediate.
The converse can be established by an argument similar to that
of the preceding paragraph.16 Hence, within that theory there seems to
be no way of distinguishing properly the two kinds of intension. In the
theory just sketched, however, in terms of Den and nominal virtual classes,
there is no such difficulty.
Although the form of the definiens of (D3) is essentially the same as in
the corresponding definition in Intension and Decision, with ‘LDes’ replacing
‘Des’, the definienda differ, as well as the whole ambiente. In particular,
objective intensions, as now conceived, are merely virtual classes of nominal
virtual classes, no longer real classes of real classes. Hence expressions for
them, as well as for their members, can occur only in suitably defined
contexts. Hence also, strictly speaking, there are no such things as objective
intensions at all. We cannot quantify over them nor over their members.
But it is not clear that we ever need to do this anyhow, as in the case
of virtual classes generally. We can, however, specify an objective intension
as having such and such specified nominal virtual classes as members.
And this is what, after all, we wish. A meaning or intension is nothing if
not specified. And if it is properly specified in some sense, nothing more
is needed. Thus, although it might seem that something important is
sacrificed in the virtual treatment of intensions, it is by no means clear that
this is the case.
It might be thought that in the theory here extensionality is in some
fashion abandoned. But this is not the case, as we have already in effect
noted. Suitable extensionality laws hold for L-designation. To say that a
LDes x∋(—x—) is merely to say that a is of such and such a shape. Just
because x∋(—x—) might be the same virtual class as x∋(. . x . .) by no
means gives grounds for thinking that a LDes x∋(. . x . .) also. That a has
one shape in fact rules out that it can have another. But clearly if a = m
and such and such holds of a, then such and such holds of m likewise
(where in place of ‘m’ we put in the structural description of a PredConOne).
Also if a = b and a LDes x∋(—x—), then b LDes x∋(—x—)
also. Finally, if a LDes x∋(—x—) and a LDes x∋(. . x . .), then not
only is x∋(—x—) the same virtual class as x∋(. . x . .) but also m = n
(where in place of ‘m’ and ‘n’ we put in respectively the structural descriptions
of ‘x∋(—x—)’ and ‘x∋(. . x . .)’). Thus we see that suitable
extensionality laws hold for LDes without restriction.
Nominal virtual classes differ from ordinary virtual classes only in a
manière de parler. There is no difference in ontology; in fact there is no
ontology here at all, for virtual classes strictly do not exist in the sense
of being values for variables. We speak, however, as though there were
such things as nominal virtual classes and thus give the theory of intensions
a kind of ontological flavor. Distinctions in intension thus seem to
reflect an ontological difference, but actually are reducible merely to
differences in the manières de parler. Strictly there are no such things as
virtual classes, nominal virtual classes, or intensions. They are mere forms
of nonbeing.

VIII

We note that the only contexts thus far introduced in which ‘ObjAnlytcInt(a)’
may occur significantly are of the form of the definiendum of (D3), in
which a nominal virtual class is said to be a member of the ObjAnlytcInt(a).
And similarly for ‘ObjVerInt(a)’, ‘ObjSynthcInt(a)’, and ‘ObjThmInt(a)’.
But further contexts may be introduced definitionally, in particular,
contexts in which we may say that one intension is included in another or
that an intension is identical with an intension.
Various relations of inclusion as between one-place predicate constants
have already been introduced. The following definitions introduce various
relations of inclusion as between objective intensions.

‘ObjAnlytcInt(a) ⊂ ObjAnlytcInt(b)’ abbreviates
‘(PredConOne a • PredConOne b • (c)(a AnlytcInc c ⊃ b AnlytcInc c))’,

‘ObjAnlytcInt(a) ⊂ ObjVerInt(b)’ abbreviates
‘(PredConOne a • PredConOne b • (c)(a AnlytcInc c ⊃ b VerInc c))’,

and so on, for all possible cases. There are sixteen in all. And identity for
each case is merely mutual inclusion. Thus

‘ObjAnlytcInt(a) = ObjAnlytcInt(b)’ abbreviates
‘(ObjAnlytcInt(a) ⊂ ObjAnlytcInt(b) • ObjAnlytcInt(b) ⊂ ObjAnlytcInt(a))’,

and so on.
Also we can introduce an existential operator for objective intensions.
We let

‘E!ObjAnlytcInt(a)’ abbreviate ‘(Eb) a AnlytcInc b’,

and so on.
We have immediately existence theorems, to the effect that if a is a
PredConOne, then E!ObjAnlytcInt(a), E!ObjVerInt(a), and E!ObjThmInt(a).
For synthetic or factual intensions much depends upon the axioms
of the object-language, which in turn depend upon what the facts are, so
to speak. Ordinarily we should be able to show that the objective synthetic
intension of a PredConOne exists also, but it is not clear that this can
always be done.
We have some interesting inclusional laws for intensions as follows.

T1a. ⊢ PredConOne a ⊃ ObjAnlytcInt(a) ⊂ ObjVerInt(a).

(The ‘⊢’ is to be read ‘is a theorem’ and is to absorb the quotation marks
around the string of symbols following it.)

T1b. ⊢ PredConOne a ⊃ ObjThmInt(a) ⊂ ObjVerInt(a).

T1c. ⊢ PredConOne a ⊃ ObjAnlytcInt(a) ⊂ ObjThmInt(a).

T1d. ⊢ PredConOne a ⊃ ObjSynthcInt(a) ⊂ ObjVerInt(a).

Let ‘F’, ‘G’, and ‘H’ now be any PredConOne’s.
That analytic and synthetic intensions are properly distinguished is shown
by the following law.

T2. ⊢ F ∈ ObjSynthcInt(a) ⊃ ~ F ∈ ObjAnlytcInt(a).

And that they are properly combinable, by the following.

T3. ⊢ F ∈ ObjVerInt(a) ≡ (F ∈ ObjAnlytcInt(a) v F ∈ ObjSynthcInt(a)).

We are now in a position to state several laws of inverse variation in
an exact and (hopefully) adequate form.

T4a. ⊢ (PredConOne a • PredConOne b) ⊃ (ObjAnlytcInt(a) ⊂
ObjAnlytcInt(b) ≡ b AnlytcInc a).

T4b. ⊢ (PredConOne a • PredConOne b) ⊃ (ObjVerInt(a) ⊂
ObjVerInt(b) ≡ b VerInc a).

T4c. ⊢ (PredConOne a • PredConOne b) ⊃ (ObjThmInt(a) ⊂
ObjThmInt(b) ≡ b ThmInc a).
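
The laws T1a, T4a, and T4b can be checked by brute force in a toy model. The model is an assumption made for illustration and not a proof of the theorems: predicates are sets of primitive conjuncts, `FACTS` is invented data, and inclusion of intensions follows the definitional pattern ‘(c)(a Inc c ⊃ b Inc c)’ with c ranging over the predicates listed in `DEFS`.

```python
from itertools import product

# Assumptions: predicates are sets of primitive conjuncts; FACTS is invented data.
DEFS = {
    "R": frozenset({"R"}),
    "A": frozenset({"A"}),
    "B": frozenset({"B"}),
    "M": frozenset({"R", "A"}),  # men, defined as rational animals
}
FACTS = {
    "socrates": {"R", "A", "B"},
    "fido": {"A"},
    "robin": {"A", "B"},
}
NAMES = sorted(DEFS)

def anlytc_inc(a, b):
    """a is analytically included in b: the conjuncts of b are among those of a."""
    return DEFS[b] <= DEFS[a]

def ver_inc(a, b):
    """a is veridically included in b: whatever in fact satisfies a satisfies b."""
    return all(DEFS[b] <= props for props in FACTS.values() if DEFS[a] <= props)

def intension(a, inc):
    """L-designators c such that a bears the inclusion relation inc to c."""
    return {c for c in NAMES if inc(a, c)}

def law_t1a():
    """The analytic intension of a is included in its veridical intension."""
    return all(intension(a, anlytc_inc) <= intension(a, ver_inc) for a in NAMES)

def law_t4(inc):
    """Int(a) is included in Int(b) if and only if b is included in a."""
    return all(
        (intension(a, inc) <= intension(b, inc)) == inc(b, a)
        for a, b in product(NAMES, repeat=2)
    )

print(law_t1a(), law_t4(anlytc_inc), law_t4(ver_inc))
print(sorted(intension("A", anlytc_inc)), sorted(intension("M", anlytc_inc)))
```

The last line shows the inverse variation in a single case: ‘M’ is analytically included in ‘A’, and correspondingly the analytic intension of ‘A’ is included in that of ‘M’.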

Also we have several mixed laws of inverse variation as follows.

T5a. ⊢ (PredConOne a • PredConOne b • (ObjVerInt(a) ⊂ ObjAnlytcInt(b)
v ObjThmInt(a) ⊂ ObjAnlytcInt(b))) ⊃ b AnlytcInc a.

T5b. ⊢ (PredConOne a • PredConOne b) ⊃ ((ObjAnlytcInt(a) ⊂ ObjVerInt(b)
v ObjThmInt(a) ⊂ ObjVerInt(b)) ≡ b VerInc a).

T5c. ⊢ (PredConOne a • PredConOne b • (ObjAnlytcInt(a) ⊂ ObjThmInt(b)
v ObjVerInt(a) ⊂ ObjThmInt(b))) ⊃ b ThmInc a.

Various relations of equivalence as between PredConOne’s may now be
introduced as mutual inclusions. Let

‘a AnlytcEquiv b’ abbreviate ‘(a AnlytcInc b • b AnlytcInc a)’.

The definiendum reads ‘a is analytically equivalent with b’. Similarly let

‘a SynthcEquiv b’ abbreviate ‘(a SynthcInc b • b SynthcInc a)’,

and so on.
As corollaries of T4a-T4c we have the following laws concerning the
various kinds of equivalence.

T6a. ⊢ (PredConOne a • PredConOne b) ⊃ (ObjAnlytcInt(a) =
ObjAnlytcInt(b) ≡ a AnlytcEquiv b).

T6b. ⊢ (PredConOne a • PredConOne b) ⊃ (ObjVerInt(a) =
ObjVerInt(b) ≡ a VerEquiv b).

T6c. ⊢ (PredConOne a • PredConOne b) ⊃ (ObjThmInt(a) =
ObjThmInt(b) ≡ a ThmEquiv b).

Let us say that one nominal virtual class is analytically included in another
if the expression which LDes the first is analytically included in the expression
which LDes the second. Thus

‘F AnlytcInc G’ abbreviates ‘(Ea)(Eb)(a LDes F • b LDes G •
a AnlytcInc b)’.

Also we may introduce the relation of analytical equivalence as between
nominal virtual classes. We let

‘F AnlytcEquiv G’ abbreviate ‘(F AnlytcInc G • G AnlytcInc F)’.

We now have some laws of inverse variation involving L-designation.

T7a. ⊢ (PredConOne a • PredConOne b • a LDes F • b LDes G) ⊃
(ObjAnlytcInt(a) ⊂ ObjAnlytcInt(b) ≡ G AnlytcInc F),

and hence

T7b. ⊢ (PredConOne a • PredConOne b • a LDes F • b LDes G) ⊃
(ObjAnlytcInt(a) = ObjAnlytcInt(b) ≡ F AnlytcEquiv G).

T8a. ⊢ (PredConOne a • PredConOne b • a LDes F • b LDes G) ⊃
(ObjVerInt(a) ⊂ ObjVerInt(b) ≡ G VerInc F),

and hence

T8b. ⊢ (PredConOne a • PredConOne b • a LDes F • b LDes G) ⊃
(ObjVerInt(a) = ObjVerInt(b) ≡ F VerEquiv G).

We need not tarry with these technicalia further, the main line of development
being clear enough.

IX

We go on now to further kinds of intension, as suggested in Intension
and Decision, refashioning them as needed. In particular, the so-called
Whiteheadian intensions must be handled rather differently. Here we have
no type-theory by way of logical scaffolding and hence must make the
virtual theory suffice. Whiteheadian intensions will have as members nominal
virtual classes of virtual classes!
The general definition for Whiteheadian analytic intensions is as follows.
Here as above we introduce only contexts in which such and such a nominal
virtual class (of higher type) is said to be a member of a Whiteheadian
analytic intension of a PredConOne. We presuppose that abstracts of
higher type such as

‘G∋(—G—)’

have been introduced contextually, and that

‘d LDes G∋(—G—)’

has been suitably defined. Then, without more ado,

(D4) ‘G∋(—G—) ∈ WhtdAnlytcInt(a)’ is to abbreviate
‘(PredConOne a • (Eb)(b LDes G∋(—G—) • Anlytc (b⌢a)))’.

(The ‘⌢’ here is the sign of concatenation, so that (b⌢a) is the string of
symbols b followed by a.)
And similarly for the other kinds of Whiteheadian intension, synthetic,
veridical, and theoremic.
As an example, consider again the defined PredConOne ‘M’. We note
that ‘(x)(Mx ⊃ Mx)’, ‘(x)(Mx ⊃ (Rx • Ax))’, ‘(x)(Mx ⊃ Rx)’, ‘(x)(Mx ⊃ Ax)’,
etc., etc., are a few Anlytc’s containing ‘M’. As members of
the WhtdAnlytcInt(a), where a LDes M, we have then the nominal virtual
classes of virtual classes G∋(x)(Mx ⊃ Gx), G∋(x)(Gx ⊃ Mx), G∋(x)(Gx ⊃ Gx),
G∋(x)(Gx ⊃ (Rx • Ax)), G∋(x)(Gx ⊃ Rx), G∋(x)(Gx ⊃ Ax), etc., etc.
An important point to note is that the various kinds of Whiteheadian intension
may also be introduced for individual constants and Russellian
descriptions of individuals, and this in contrast to the other types of objective
intension mentioned. First let us consider primitive individual constants
and then Russellian descriptions.
Let ‘InCon a’ express that a is a primitive individual constant of L. Then

(D5) ‘x∋(—x—) ∈ WhtdAnlytcInt(a)’ may abbreviate
‘(InCon a • (Ed)(d LDes x∋(—x—) • Anlytc (d⌢a)))’.

And similarly for the other kinds, synthetic, veridical, and theoremic.
Now for Russellian descriptions, taken as primitives of L. Let ‘inviota’
be the structural-descriptive name of the inverted iota ‘℩’, and ‘ex’ of ‘x’.
If c is the structural-descriptive name of ‘(—x—)’ then ‘(inviota⌢ex⌢c)’ is
the structural description of ‘(℩x . (—x—))’. Likewise let ‘invep’ be the
structural description of the inverted epsilon ‘∋’. And finally let
‘SentFuncOne a,d’ express that a is a sentential function of L containing d as its only
free variable. Then we may let

(D6) ‘x∋(—x—) ∈ WhtdAnlytcInt(a)’ abbreviate
‘(Ed)(Ec)(d LDes x∋(—x—) • SentFuncOne c,ex • a = (inviota⌢ex⌢c) •
(ex⌢invep⌢c) AnlytcInc d)’.

(Actually this definition should be generalized to allow for variables other
than ex.) (D5) and the generalization of (D6) together would provide a
definition of WhtdAnlytcInt’s for all individual expressions of L.17
Enough has been said surely to indicate how the theory of Whiteheadian
intensions could be further developed.
That intensions turn out to be merely virtual lends further support to
the theory of virtual classes. And that intensions may be introduced within
the semantical meta-language based on ‘Den’ provides further ground for
thinking that this meta-language is adequate for philosophical purposes.
This meta-language, or kind of meta-language, enjoys many advantages
over its alternatives, as we have noted. To have used a meta-language of
higher order for the theory of intensions, as in Intension and Decision, was
a step backward anyhow. To have employed essentially the whole of
mathematics (contained in such a meta-language) merely to develop so
essentially simple a theory as the semantical theory of intensions is to have
employed too much. The notions of intension are not notions of mathematics
and it is an anomaly to have supposed that they were. (By ‘mathematics’
here we mean, barring niceties, the theory of real classes and/or sets.) The
inadequacy of the treatment in Intension and Decision is in part to have
made too much of real classes rather than to have relied on their virtual
counterparts. Here as elsewhere real classes lead us astray, virtual ones, on
the other hand, being a sure guide.

X

It will be of interest to compare very roughly the foregoing theory with
some suggestions of Russell, with whom there is welcome kinship.
In his famous paper “On Denoting” of 1905, Russell makes the key
point that “when we wish to speak about the meaning of a denoting phrase,
as opposed to its denotation, the natural mode of doing so is by inverted
commas.”18 Let C be some denoting phrase. Russell is interested in the
relation between C and ‘C’. “. . . [W]hen C occurs,” he says, “it is the
denotation that we are speaking about; but when ‘C’ occurs, it is the meaning.”
We need not tarry over the details of Russell’s discussion, which contains
flashes of great insight midst a morass of confusion. The key point,
which he seems one of the first to have noted, is that discussion of meaning
should take place in terms of discussion of signs. The meaning of a sign is
thus to be found, in part at least, in its mention. Once this is granted, one
can part from Russell widely on matters of detail.
Russell differs here remarkably from Frege. For Frege the discussion of
meaning is to take place in terms of abstract postulated entities such as
Sinne or senses. Whatever Russell wishes to say about meaning, however,
is to be said in terms of the signs. This view is echoed somewhat in the
foregoing use of nominal virtual classes. But because the theory of intensions
above presupposes the notion of analytic truth, it is essentially a
semantical theory, as with Frege, even though the semantical presuppositions
involved differ remarkably.
In the theory put forward above, of course, ‘denotation’ is taken in a
very different sense from that of Russell. Here only primitive or defined
predicate constants denote. Russell, on the other hand, in addition to
descriptive phrases of the form ‘the one so and so,’ regards phrases such
as ‘a man,’ ‘all men,’ ‘some man,’ and so on, as denoting. Apparently
he had not, in 1905, straightened out completely the proper use of the
quantifiers. Once this is done, what is left of Russell’s theory concerns only
descriptions.
Descriptive phrases have been explicitly introduced in the theory of
intensions above. There are of course alternative methods of doing this. Much
depends upon whether descriptions are taken as primitive or defined in
Russell’s way. If the latter, there is then no need for a discussion of their
meaning. If, on the other hand, they are taken as primitives, as with Frege
or Carnap, and above, then of course there is. The distinction between
primitive and defined signs in general is of the utmost methodological
importance. Yet curiously little attention is paid to it by methodologists or
analytic philosophers. Doubtless the reason is that the distinction seems
irrelevant for natural language. We do not worry about what the primitives
of a natural language are, although it is no doubt time that we did. But
for logic, the discussion of language-systems, syntax, semantics, and the
like, the distinction cannot be disregarded with impunity.
Russell is never too clear as to precisely what a meaning is. To say
merely that “when ‘C’ occurs, it is the meaning” we are speaking about, or
that “the natural mode” of speaking about the meaning is by inverted
commas, is not sufficient. Nonetheless, in these passages Russell is careful to
distinguish between use and mention, a care unfortunately he does not
always show. Also he is careful here to distinguish between denotation and
meaning, a care likewise which unfortunately he does not always show. In
the theory above, meanings or intensions are explicitly identified with
certain kinds of virtual classes, and thus given an explicit ontological
status.
Let us tarry with Russellian descriptions a little, particularly as concerns
Frege’s celebrated example of ‘the morning star’ and ‘the evening star,’
so much discussed in the literature. These two phrases presumably stand for
or designate the same object, but should have different meanings or analytic
intensions.19
Within L, let ‘Sx’ for the moment read ‘x is a star’, ‘Mx’, ‘x shines in the
morning’, and ‘Ex’, ‘x shines in the evening’. As is customary we disregard
for the moment all astronomical matters, all matters concerned with time,
as well as more precise characterizations of ‘M,’ ‘E,’ and ‘S.’ Let Russellian
descriptions be regarded as primitive terms of L. Let ‘ms’ and ‘es’ be
distinct, defined individual constants, introduced in such a way that

‘ms = (℩x.(Mx • Sx))’

and

‘es = (℩x.(Ex • Sx))’

are analytic sentences of L. The Whiteheadian analytic intensions of ‘ms’
and ‘es’ then differ, as we wish them to intuitively, because we can find a
nominal member of the one which is not a member of the other. Now
clearly ‘(x)((Mx • Sx) ⊃ (Mx • Sx))’ is analytic, whereas ‘(x)((Mx •
Sx) ⊃ (Ex • Sx))’, although true in astronomy, is not. Hence

x∋(Mx • Sx) ε WhtdAnlytcInt(‘ms’)

but

~ x∋(Mx • Sx) ε WhtdAnlytcInt(‘es’).

And similarly for other cases. If, on the other hand, ‘ms’ and ‘es’ are
primitive individual constants, and where ‘ms = es’ is presumably
synthetically true, then clearly

x∋(x = ms) ε WhtdAnlytcInt(‘ms’)

but

~ x∋(x = ms) ε WhtdAnlytcInt(‘es’).

XI

Let us turn now to the relation of connoting.20
The objective analytic intension of a PredConOne may be regarded as,
if one wishes, its connotation. Surely there is enough agreement with the
traditional notion to justify the terminology. Traditionally English logicians
have preferred ‘connotation’ to the more Germanic ‘intension.’ At any
event, it is convenient to speak here of a relation of connoting, correlative
with that of denoting. Actually the matter is very simple, for we may im¬
mediately introduce

‘a Con F’ for ‘F ε ObjAnlytcInt(a)’.

The definiendum may read ‘a connotes the nominal virtual class F’.
More interesting is the notion of virtual attribute which emerges from the
foregoing. For this we need the relation of designation for individuals,
which has not thus far been introduced. Let

‘a Des x’ abbreviate ‘(InCon a • (ex⌢invep⌢ex⌢id⌢a) Den x)’.


Designation of individuals thus reduces in effect to a kind of virtual-class
denotation.21
To say now that an object x has the attribute associated with a
PredConOne is to say that x is a member of every nominal virtual class in the
ObjAnlytcInt(a). This notion is not to be defined in this fashion, but rather
as follows.

‘x Att a’ abbreviates ‘(PredConOne a • (Eb)(b Des x •
(c)((PredConOne c • a AnlytcInc c) ⊃ Tr (c⌢b))))’.

Thus, roughly, an individual x has the attribute associated with a given
PredConOne, a, if x is truly a member of every virtual class in which the
virtual class which a designates is analytically included.
Note that this roundabout definition achieves what we wish, lacking as
we do quantifiers over nominal virtual classes. Note also that variables
over attributes are not available. Attributes are associated rather with
PredConOne’s, one attribute corresponding with many, but analytically
equivalent, PredConOne’s. This association is reminiscent in some respects
of Fitch’s conception.22
We need not tarry further here with ‘Con’ and ‘Att,’ for most of what is
said in “On Connotation and Attribute” now holds of these notions. Note
again that the basic change is merely that of using L-designation in place
of designation and nominal virtual classes in place of real ones.
An alternative notion of attribute suggests itself if we regard the
attribute corresponding to a PredConOne a as the nominal virtual class L-
designated by a. The attribute is then merely the virtual-class concept, or
the virtual class taken in intension. This rendering is clearly simpler than
the foregoing, and perhaps more natural. Given any virtual class, there
are then as many attributes corresponding to it as there are distinct
PredConOne’s designating it. But of course some of these are equivalent with each
other, or L-equivalent, etc.

XII

Carnap conveniently distinguishes between the intension of a term and
its sense.23 Intensions are related to analytic equivalence similar to the way
in which senses are related to synonymy. Two terms which are analytically
(or L-) equivalent have the same intension, just as two terms which are
synonymous presumably have the same sense. Just what senses are, how¬
ever, to say nothing of synonymy, remains rather obscure. The following
suggestions toward providing an ontology for senses may not be without
interest.
Let us say, for the moment, that a PredConOne a is synonymous with
a PredConOne b, provided they are analytically equivalent and every
primitive predicate or individual constant which occurs in one occurs in
the other with their left-to-right order preserved.
The three conditions seem reasonable. That analytic equivalence is a
necessary condition for synonymy scarcely needs comment. That the two
expressions contain precisely the same nonlogical constants rules out
extraneous subject matter referred to in one and not in the other (even if
only in analytic contexts).24 For example, ‘x∋Mx’ is not synonymous with
‘x∋(Mx • (Rx v ~ Rx))’, there being (in some sense) reference to R’s in
the second but not in the first. The preservation of the left-to-right order of
occurrence is also desirable, assuring that the “movement of thought,” as it
were, is the same for the two expressions. Thus ‘x∋(Mx • Rx)’ is not
synonymous here with ‘x∋(Rx • Mx)’.25 The emphasis is somehow different,
it might be maintained, and this difference should be taken account of.
Perhaps there are further conditions to be added. For example, we might
wish to insist that the association within an abstract be the same, so that
‘x∋(Mx • (Rx • Fx))’ would not be regarded as synonymous with ‘x∋((Mx •
Rx) • Fx)’. Here too there is a difference in the “movement of thought,”
it might be maintained, something more than a mere notational or logical
difference.
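The two syntactic conditions on synonymy can be sketched mechanically. The following is a minimal illustration only, under stated assumptions: abstracts are written as plain strings, each capital letter counts as a one-place predicate constant, each lowercase run of two or more letters (such as ‘ms’) counts as an individual constant, and single lowercase letters are variables or connectives. The function names are hypothetical, and analytic equivalence is not computed here but supplied by the caller.

```python
import re
from typing import Callable, List


def nonlogical_constants(abstract: str) -> List[str]:
    """Return the nonlogical constants of an abstract in left-to-right order.

    Assumed convention: capitals are predicate constants, lowercase runs of
    two or more letters are individual constants, single lowercase letters
    (x, v, ...) are variables or connectives and are ignored.
    """
    return re.findall(r"[A-Z]|[a-z]{2,}", abstract)


def synonymous(a: str, b: str,
               analytically_equivalent: Callable[[str, str], bool]) -> bool:
    """Tentative synonymy: analytic equivalence plus identical constants
    occurring in the same left-to-right order."""
    return (analytically_equivalent(a, b)
            and nonlogical_constants(a) == nonlogical_constants(b))


if __name__ == "__main__":
    # Stand-in oracle that grants analytic equivalence, so that only the
    # syntactic conditions decide the outcome below.
    always = lambda a, b: True
    print(synonymous("x∋(Mx • Rx)", "x∋(Rx • Mx)", always))
    print(synonymous("x∋Mx", "x∋(Mx • (Rx v ~ Rx))", always))
```

On this sketch both example pairs from the text fail to be synonymous: the first because the order of ‘M’ and ‘R’ differs, the second because ‘R’ occurs in only one of the two abstracts. The further condition on association within an abstract is not modeled.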
Suppose for the moment that a satisfactory definition of synonymy were
forthcoming in some fashion. Let ‘a Syn b’ express then that a and b are
synonymous PredConOne’s. What then is a sense?
We have spoken only of synonymy, which arises out of analytical
equivalence by suitable restrictions. What relation do we get by appending
these restrictions rather to the definition of analytical inclusion? Let us
call it, for want of a better phrase, ‘analytic entailment’. Then synonymy is
mutual analytic entailment. Let ‘a AnlytcEnt b’ express that the
PredConOne a analytically entails the PredConOne b.
It seems natural now to think of a sense as arising from analytic
entailment similar to the way in which an objective analytic intension arises
from analytic inclusion. Thus, we might let

(D6) ‘F ε Sns(a)’ abbreviate ‘(PredConOne a •
(Eb)(b LDes F • a AnlytcEnt b))’.

Again, this definition is put forward merely as a suggestion, worthy perhaps
of further reflection. Other contexts involving ‘Sns’ can presumably be
introduced in the fashion of § VIII above. Also notions of Whiteheadian
sense may be defined, by suitable rephrasing of (D4).

XIII

The foregoing remarks provide a fairly complete sketch of an improved
first-order theory of intensions, which is to supplant that of Chapter V of
Intension and Decision. Although the key definition there, that of
‘ObjAnlytcInt’, was faulty and threw the whole chapter out of focus, we see
now its value as a heuristic, using L-designation in place of designation and
nominal virtual classes in place of real ones.26
We should emphasize again the extreme simplicity of the theory. No
complicated notions are used anywhere, in particular no notions depending
upon higher-order quantifiers or in any way upon the mathematical theory
of sets. This very simplicity is thought to recommend it for philosophic and
linguistic use. We may conveniently distinguish elementary (or first-order)
from theoretical semantics, just as mathematicians frequently distinguish
between elementary and analytic number theory. In each case, the
elementary presupposes usually only a first-order logic. Theoretical semantics,
comprising the so-called theory of “models,” presupposes either a higher-
order logic or some suitable set theory. In elementary semantics such high-
powered tools are eschewed. Just as it is of interest in number theory to
use only elementary procedures where possible, so in semantics. We have
attempted to show in this paper that elementary procedures suffice for a
semantical theory of intensions, and this heretofore, if we mistake not, has
not been thought possible.
Attention has been confined here wholly to PredConOne’s. But we can
go on to two-place, etc., predicate constants, constructing a theory of
intensions for such by analogy with the foregoing. Perhaps here, as in
relation theory generally, some interesting intrinsically relational notions would
emerge, transcending the analogy with PredConOne’s.
To summarize. The theory sketched in this paper incorporates the
following features. (1) We are concerned with analyzed intensions, not
intensions sui generis, and in this analysis a multitude of kinds emerges not
distinguished by advocates of the latter. (2) The theory is given within
a first-order semantical meta-language for any first-order system L. (3)
Intensions are given only virtual existence, and at that only nominally so.
(4) There is kinship with Russell in the suggestion that meanings are to be
found primarily in the mention of expressions. (5) Intensions are dissected
into parts or members, so that one can speak of such and such a nominal
virtual class as being a member of such and such an intension. Such modes
of speaking are apparently not available for intensions sui generis, usually
regarded as in some sense indivisible or undissectible wholes. (6) Several
laws of inverse variation are given in what is (hopefully) a satisfactory
way. It is not clear that these laws have ever heretofore been adequately
stated. (7) Suitable notions of connotation and attribute are forthcoming
within the meta-languages discussed. (8) Some tentative suggestions are
put forward toward characterizing notions of sense and synonymy.
Logic advances by slow, minute steps, but these we must not be afraid
of, or disregard or condemn, if we wish to move forward and avoid
“elementary confusions.” The truth, it is said, is hid behind seventy
thousand veils. If we succeed in lifting but one, we may consider ourselves
fortunate.

NOTES
1. This paper is an improved and expanded version of “An Improvement in the
Theory of Intensions,” Philosophical Studies XVII (1967): 33-38. Also some
paragraphs are borrowed from “On the Inverse Law of Extension and Intension,”
Memorias del XIII Congreso Internacional de Filosofía, Mexico City, 1962, with the
kind permission of the editors.
2. The Principles of Logic, 2nd ed. (London, Oxford University Press: 1922),
Vol. II, p. 486.
3. Op. cit., Vol. I, p. 170.
4. See in particular B. Bosanquet, Logic, 2nd ed. (London, Oxford University
Press: 1911), Vol. I, pp. 64 and passim.
5. (London, Macmillan: 1906), pp. 35 ff. This book, termed “excellent” by
C. S. Peirce (Collected Papers, 3.384), has been called “the most perfect
presentation of classical formal logic in general . . . [which] has been equally great and
beneficent in its influence within Anglo-Saxon civilization” by Heinrich Scholz, in
his Concise History of Logic, tr. by K. Leidecker (New York, Philosophical
Library: 1961), p. 48.
6. Cf. the author’s Truth and Denotation (Chicago, University of Chicago Press:
1958), p. 104.
7. Cf., however, the author’s Intension and Decision (New York, Prentice-Hall:
1963), Chapter II-IV.
8. See, for example, From a Logical Point of View (Cambridge, Harvard
University Press: 1953) and Word and Object (Cambridge, The Technology Press of
the Massachusetts Institute of Technology and New York and London, John Wiley
and Sons: 1960).
9. (New York, Harcourt, Brace and Co.: 1934), pp. 30-33.
10. Further, the Cohen and Nagel laws, if such they be, appear to presuppose
fragments at least of some kind of modal set theory, which, it is to be feared, had
not been adequately formulated at the time. Cf., however, F. B. Fitch, “A Complete
and Consistent Modal Set Theory,” The Journal of Symbolic Logic 32 (1967):
93-103.
11. See esp. R. Carnap, Introduction to Semantics (Cambridge, Harvard
University Press: 1942), p. 81 and Meaning and Necessity (Chicago, University of Chicago
Press: 1946), p. 163.
12. See the author’s “The Philosophic Import of Virtual Classes,” The Journal
of Philosophy LXI (1964): 377-387. Cf. also W. V. Quine, Set Theory and Its
Logic (Cambridge, The Belknap Press of Harvard University Press: 1963), pp. 15 ff.
13. See especially Meaning and Necessity, passim.
14. See, e.g., Truth and Denotation, pp. 99 ff., or Intension and Decision, p. 14.
15. Meaning and Necessity, p. 163.
16. For essentially this argument the author is indebted to Mr. Evan Jobe. See
his “R. M. Martin’s System of Pragmatics,” Methodos XV (1963): 313-330, and
his “A Note on Connotation and Attribute,” The Journal of Philosophy LXII
(1965): 325-328.
17. Cf. the author’s “On Proper Names and Frege’s Darstellungsweise,” The
Monist 51 (1967): 1-8.
18. B. Russell, Logic and Knowledge (London, Allen and Unwin: 1956), pp.
39-56.
19. The following paragraph corrects pp. 124-125 of Intension and Decision.
20. Cf. the author’s “On Connotation and Attribute,” The Journal of Philosophy
LXI (1964): 711-724.
21. Cf. Truth and Denotation, pp. 213 ff. or Intension and Decision, p. 35. ‘invep’
here as above is the structural description of ‘∋’, ‘ex’ of ‘x’, ‘id’ of ‘=’, and ‘⌢’ is the
sign of concatenation. Cf. Truth and Denotation, Chapter III.
22. Cf. F. B. Fitch, “Attribute and Class,” in Philosophic Thought in France and
the United States, ed. by Marvin Farber (Buffalo, University of Buffalo Publications
in Philosophy: 1950), pp. 545-563.
23. See especially The Philosophy of Rudolf Carnap (in The Library of Living
Philosophers, ed. by P. A. Schilpp, LaSalle, Ill., Open Court: 1963), pp. 897 ff.
24. This suggestion is due, if we mistake not, to C. I. Lewis.
25. Carnap queries essentially this point in The Philosophy of Rudolf Carnap,
p. 899.
26. The author wishes to thank Messrs. Hendry, Jobe, and Montague for their
criticisms, which have helped pave the way for this improved formulation. Cf.
Herbert E. Hendry, “Professor Martin’s Intensions,” The Journal of Philosophy
LXII (1965): 432-434 and Richard Montague’s review of Intension and Decision,
The Journal of Symbolic Logic 31 (1966): 98-102.
CONFIRMATION OF LAWS
Mary Hesse

1. INTRODUCTION

1.1. As early as 1939, in his perceptive essay in the International
Encyclopedia of Unified Science,1 Professor Ernest Nagel raised three
fundamental objections to the idea of a quantified confirmation theory, all of
which have occasioned much subsequent discussion, and none of which
has yet been satisfactorily overcome. These objections concerned the
confirmatory relevance of variety of evidence, the problem of comparison of
theories by means of degrees of confirmation, and the problem of the
relation between confirmation and acceptability. More recently, in the
Carnap volume in the Library of Living Philosophers,2 Nagel has returned
to the attack with a variety of cogent objections to Carnap’s developed
confirmation theory, almost all of which has appeared since the essay of
1939, but in which Nagel does not find satisfactory solutions of the
difficulties adumbrated then nor of several others which he now develops. In
tribute, then, to Ernest Nagel, whose influence in the philosophy of science
has been felt in so many fields, I should like to take up some of these
fundamental objections in the light of recent discussion of induction and
confirmation.3
The problem of explicating induction in terms of some sort of confirmation
theory has reentered the philosophical scene during the past few years,
with some attempts to develop new and more satisfactory probabilistic
confirmation functions of the Carnap type,4 and other attempts to develop
nonprobabilistic functions as alone adequate to the requirements of a
confirmation theory.5 The problem is more than a merely technical one, since
the attempt to explicate induction requires considerable preformal
discussion, which inevitably reacts upon our intuitive understanding of induction
itself and raises some fundamental issues about the nature of inference to
laws and theories in science. Here I want to consider in particular two
crucial problems for probabilistic confirmation theory: (1) the fact that in
most such theories, the confirmation of universal generalizations in infinite
domains turns out to be zero; and (2) the claim that probability cannot
be relevant to the confirmation of laws since the logic of laws must be
stronger than that of universal generalizations, and hence probability theory
cannot take account of this logic. I shall suggest that a solution to these
difficulties is possible if we adopt an attitude to induction which can best be
described by reference to Mill’s induction from particulars to particulars,
as opposed to the more usual formulation of induction from particulars to
general laws and theories.
Let me first explain briefly what I mean by a confirmation theory. I
shall assume that the aim of an explication of induction is to find a
numerical or comparative function c(h,e)—the “confirmation of hypothesis
h given total evidence e”—which is a syntactic function of the statements
h and e, and which has a high or comparatively high value in those cases
where normally accepted or intuitive inductive methods would direct us
to accept hypothesis h on total evidence e, at least in comparison with other
hypotheses having lower c-value. I shall further assume that c(h,e) satisfies
the usual axioms for a probability function, and that therefore, if the
appropriate prior probabilities exist and are finite, c(h,e) satisfies Bayes’ theorem:

c(h,e) = c₀(h.e) / c₀(e)

This assumption is, of course, controversial, and I cannot pursue that argu¬
ment here, but shall be content to assume that there are strong arguments
in favor of some probabilistic c-theory, but that the particular c-theory
which is adopted (that is, the particular assignment of prior probabilities)
will depend on the best way that can be found of satisfying criteria of
adequacy derived from our intuitive use of inductive methods. I do not
wish to claim, as Carnap appeared to claim in Logical Foundations of
Probability, that confirmation measures a unique objective relation between
hypothesis and evidence, but rather (which Carnap also claims) that it is
a measure of the reasonable degree of belief of scientists in the hypothesis
on the basis of the total available evidence. This degree of belief is not to
be understood to be established directly by consulting explicit
pronouncements of scientists or by conducting opinion polls, but indirectly by
examining what is presupposed by their generally agreed inductive behavior.
The project of developing a c-theory may then be likened to that of setting
up a scientific hypothesis whose statements are not all verified directly, but
by their consequences. Prior c-values, like “theoretical statements,” need
not be given a direct interpretation, but are tested by comparing the
consequences of the c-theory with generally agreed inductive behavior.
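To make the formula concrete, the following minimal sketch computes c(h,e) by brute force over a toy universe of three individuals and a single predicate B. Everything in it is an illustrative assumption rather than part of the text: the universe size, the names `uniform_prior` and `structure_prior`, and the two ways of assigning prior probabilities. The first prior weights every state description equally; the second gives equal weight to each possible number of B's, in the manner of Laplace's rule of succession. The point is only that the same h and e receive different c-values under different assignments of prior probability.

```python
from fractions import Fraction
from itertools import product
from math import comb

N = 3  # hypothetical toy universe: three individuals, one predicate B

# A state description says, for each individual, whether it is B.
STATES = list(product([True, False], repeat=N))


def uniform_prior(state):
    """Equal prior weight for every state description (assumption)."""
    return Fraction(1, len(STATES))


def structure_prior(state):
    """Equal weight for each possible number of B's, shared equally
    among the state descriptions with that number (assumption)."""
    k = sum(state)
    return Fraction(1, N + 1) / comb(N, k)


def c0(prop, prior):
    """Prior confirmation: total prior weight of the states where prop holds."""
    return sum(prior(s) for s in STATES if prop(s))


def c(h, e, prior):
    """c(h,e) = c0(h.e) / c0(e), defined only when c0(e) is nonzero."""
    denom = c0(e, prior)
    if denom == 0:
        raise ValueError("evidence has zero prior confirmation")
    return c0(lambda s: h(s) and e(s), prior) / denom


def evidence(s):
    return s[0] and s[1]  # the first two individuals are B


def hypothesis(s):
    return s[2]  # the third individual is B


if __name__ == "__main__":
    print(c(hypothesis, evidence, uniform_prior))
    print(c(hypothesis, evidence, structure_prior))
```

On this sketch the evidence leaves the hypothesis at 1/2 under the first prior and raises it to 3/4 under the second, so only the second prior permits learning from instances.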

2. CONFIRMATION OF UNIVERSAL GENERALIZATIONS

2.1. One type of problem concerned with the confirmation of laws turns
partly on technical issues within the syntax of given confirmation theories
and partly on more fundamental philosophical principles. This is the fact
that in Carnap’s early theories and in most obvious modifications of them,
the prior confirmation of universal generalizations in an infinite universe is
zero, and that therefore this confirmation cannot be greater than zero for
any finite amount of evidence, however large. This appears, of course,
extremely counterintuitive because it means we can never have better grounds
for asserting “All A’s are B” on the evidence of a large sample of A’s which
are B than we have on no evidence at all, or, worse, on the evidence of a
large sample of A’s which are not B, since the confirmation of a false
assertion is also zero.
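The zero-confirmation result can be illustrated numerically. The sketch below assumes a Laplace-style rule of succession, under which the probability that the next individual is B, given that k individuals have been observed and all were B, is (k+1)/(k+2). This rule and the function name `law_confirmation` are assumptions chosen for illustration, not Carnap's own system. Multiplying the step probabilities gives the confirmation that all N individuals of a finite universe are B, given that the first n were, and the value shrinks toward zero as N grows with n held fixed.

```python
from fractions import Fraction


def law_confirmation(n: int, N: int) -> Fraction:
    """Confirmation that all N individuals are B, given that the first n
    observed individuals are all B, under the assumed rule of succession."""
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    value = Fraction(1)
    # Extend the run of B's one individual at a time, from n up to N.
    for k in range(n, N):
        value *= Fraction(k + 1, k + 2)
    return value  # the product telescopes to (n + 1) / (N + 1)


if __name__ == "__main__":
    for size in (100, 1_000, 10_000):
        print(size, float(law_confirmation(100, size)))
```

With a hundred favorable instances the value is 1 when the universe contains only those hundred individuals, and it falls to 101/10001 when the universe contains ten thousand.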
In parenthesis it is worth pointing out that this result is not confined
to generalizations of the “All A’s are B” type (call this type (U)), but
applies also to what may be called universal statistical generalizations, that
is to say, to such assertions as “p% of A’s are B,” for given p; or even to
the weaker assertion that “p% of A’s are B” where p lies in the given
interval θ₁ to θ₂. That this is so can be seen by considering how such a
universal statistical generalization is to be understood. If it is interpreted in
terms of either logical ranges or long-run frequencies it must be equivalent
to some assertion of the form “In all sufficiently large samples of finite size
the proportion of A’s which are B is near to p%.” If on the other hand it is
interpreted in terms of a propensity which is a dispositional property of
individual A’s, it is equivalent to “All A’s have a propensity of p% to be
B.” Both these assertions have the same logical form as (U), and hence
both obtain zero confirmation under the same circumstances as (U).
Various methods of obtaining finite confirmation of laws have been
suggested, but none is entirely satisfactory. In early work in the field, Keynes,
Nicod, and von Wright gave preference in assignment of prior probabilities
to laws of type (U), and Jeffreys has argued that such laws should be
ordered in respect of simplicity and the simplest given highest prior
probability. These suggested solutions, however, demand postulates as strong
and nonobvious as the result which is sought. Hintikka has recently
developed a probabilistic confirmation function which gives finite confirmation to
universal generalizations, but his system is unsatisfactory in other respects
since it is not capable without further ad hoc additions of dealing with
statistical generalizations, and it does not deal with analogy arguments,
which, as I shall maintain later, are an essential feature of theoretical
inference. Carnap has recently stated6 that he has found a family of
c-functions within the rather minimal postulates of his own theory which do allow
universal generalizations to be confirmed, but he evidently considers that
they are too complex to be plausible and has not published the work on
which they are based.
Other ways out of the zero-confirmation difficulty have been proposed.
The earliest within Carnap’s theory was Carnap’s own suggestion of
replacing the confirmation of a universal law on given evidence by its
instance confirmation, defined as the confirmation of the hypothesis that
the next instance would fulfill the law on the same evidence, or its qualified
instance confirmation, defined as the confirmation of the hypothesis that
the next instance satisfying the antecedent of the law would also satisfy
its consequent (that is, that the next raven would be black). This proposal
places us in the following dilemma: If, with Carnap, we regard instance
confirmation as a new definition of the “confirmation” of the original law,
then this new confirmation does not satisfy the probability axioms (as has
been pointed out by Popper)7 and therefore loses the backing of all those
general arguments which show that a satisfactory explication of confirmation
ought to be a probability function. On the other hand, there is open to us
the alternative course of admitting that we cannot have nonzero
confirmation of a universal law but only of a hypothesis about the next instance (or
the next finitely many instances). But then we appear to have abandoned
the original problem as insoluble. It should be noticed that the further
objections of Popper to instance confirmation in the paper just cited are all
objections to the first course and not to the second.
In similar vein, Darlington8 has proposed to save the confirmation of
universal laws by a reinterpretation of their import and a use of interval
estimation in place of Carnap’s inverse probability. He takes the law “All
A’s are B” to be equivalent to the statistical generalization “A proportion
p of A’s are B,” where p comes as near to unity as is desired. He then
adopts, as usual in interval estimation, a confidence level α (say .99), such
that, if the probability of a hypothesis is greater than α, the hypothesis is
acceptable, otherwise not. That is to say, our intuitions with regard to the
confirmation of a law on favorable evidence will be satisfied if we can
make c(h,e) ≥ α. Now we can attain this value if we abandon the attempt
to confirm the law with p = 1, and instead replace this law by

h = “Proportion p of A’s are B”,

where θ(α,s) ≤ p ≤ 1, and s is the size of the observed sample. Then
c(h,e) = α, and α and s determine θ, the lower limit of p, where θ can be
made as close to 1 as desired by increasing s. θ cannot be made equal to 1,
however, unless α = 0, and, of course, as we have already seen, α cannot
be put equal to 1 for any value or interval of values of p.
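One way to see how α and s fix the lower limit θ is the classical one-sided bound for a sample in which all s observed A's are B. This is an assumed reading for illustration only: the text does not give Darlington's formula, and the function name `lower_limit` is hypothetical. If the true proportion were p, the chance of drawing s B's in a row would be p^s, so the values of p not rejected at level α are those with p^s ≥ 1 − α, which gives θ = (1 − α)^(1/s).

```python
def lower_limit(alpha: float, s: int) -> float:
    """Lower limit theta of the proportion p at confidence level alpha,
    after s observed A's that were all B (assumed one-sided bound)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    if s < 1:
        raise ValueError("sample size s must be at least 1")
    # Solve theta ** s == 1 - alpha for theta.
    return (1.0 - alpha) ** (1.0 / s)


if __name__ == "__main__":
    for size in (10, 100, 1_000, 10_000):
        print(size, lower_limit(0.99, size))
```

On this reading θ rises toward 1 as s increases but reaches 1 only when α = 0, in line with the remark above.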
2.2. I shall argue later for a reinterpretation of the import of universal
laws similar to that of Darlington and to that involved in the second way of
taking Carnap’s instance confirmation. Both these approaches imply that
it is after all unnecessary to consider strictly universal laws, because
intuitively satisfactory confirmation values adequate for all our needs can be
obtained without them. Let us then first ask whether the finite confirmation
of laws is really required by the scientist’s attitude to a universal
generalization. Expressing (U) now in the form (x)(Ax ⊃ Bx), where the domain
of universal quantification is infinite, the problem is to show that zero
confirmation of (U) does after all correspond to our intuitions, and that
wherever a generalization of form (U) in an infinite domain appears to be
required by scientific inference, it can be replaced without loss by a
generalization in a finite domain, whose confirmation is nonzero.
Let us begin to attack this by asking what would be implied by ascribing
to (U) a nonzero confirmation? It would mean, surely, that it was
reasonable to believe that (U) has some chance, however small, of remaining part
of the corpus of acceptable scientific knowledge indefinitely and under all
empirically possible circumstances, that is to say, under all circumstances
which can in fact obtain in nature. It is not obvious that it is reasonable
to believe this. This is not so much because such a belief covers an
infinite number of instances but rather because it may not be reasonable to
believe that (U) states a law accurately even in one instance. Qualifications
may have to be introduced; for example, A may in all observed and in
almost all as yet unobserved instances co-occur with C, a predicate which we
have not included in our description and perhaps have not even noticed.
In that case we may, if we pursue science long enough, be forced to modify
(U) to (x)(Ax.Cx ⊃ Bx) because we have found instances which are A
but neither C nor B. Or, we may find that some A’s are not B, but are
predicated by a closely similar B' which was not previously distinguished
from B, as when a law containing metric predicates is found to hold not
strictly but only approximately. Few of the laws which have been cited as
typical by logicians have in fact remained unmodified in some such ways.
Consider for example Galileo’s law of falling bodies, the law that planets
move in ellipses, Boyle’s law, Newton’s law of gravitation, the chemical
law of constant proportions, and Mendel’s laws of inheritance. All of these
can be expressed in the form “For all bodies, if a body is A it is also B,” and
in this form all have either been falsified or shown to be true only for a
finite number of instances, in which form they may have of course nonzero
confirmation. Even an apparently hard case such as “All molecules of
water consist of two hydrogen atoms and one oxygen atom” has already
had to be modified to “All molecules of water (other than heavy water)
consist of two (nonheavy) hydrogen atoms and one oxygen atom.” This
means that refusal to ascribe nonzero confirmation to “All water molecules
consist of two hydrogen atoms and one oxygen atom” in more than a finite
number of instances could not have eliminated any useful inferences in the
past since the law has in fact turned out to be true only in limited domains.
Is it reasonable to suppose that any lawlike generalization of the form
(U) in current or in any future science will remain forever unqualifiedly
true in every instance? If not, then we are committed to the belief that no
assertion such as (U) has any chance of being strictly true, and therefore
that its confirmation is always, properly, zero.
Some of the examples given above may, however, seem to involve not
so much infinite domains of discrete individuals as continua of individuals,
for example, space and time points. And even on the very weak assumption
that in every interval of such individuals there is only a countable infinity
of points (the rational divisions which we could in principle measure),
then every generalization over such an interval is an infinite generalization
in the sense considered here, and has zero confirmation. Can we do without
so modest a requirement in our domain of individuals? Consider the
generalization

(s)(Es ⊃ Fs), s₀ ≤ s ≤ s₁,

where s₀ to s₁ is an interval of at least rational numbers. Let us call this
generalization (V). An example would be: “For all spatial points, the
electric force E implies a mechanical force F,” where, to ensure that this
is an empirical law, E is defined in terms of charges and distances and
not in terms of mechanical forces. To say that this has confirmation zero
is to say that it is unreasonable to believe that it has any chance of being
true at all rational points in the interval s₀ to s₁. As in the case of generalizations
over infinite domains of discrete objects, let us first consider what it
would mean to believe that it has some chance of being true at all such
points. This would mean that however closely we peer at the interval s₀ to s₁,
taking smaller and smaller intervals within it, still we shall find that Es
implies Fs. Now it is of course not possible operationally to investigate
this assertion for more than a finite number of subintervals, but it is
dangerous to assume that what cannot operationally be demonstrated to be
true is either eliminable from science or false. I certainly do not want to
make any such assumption. We can peer more and more closely at
diminishing spatiotemporal intervals by means of theoretical as well as
operational instruments.
Let us rather proceed as we did in the case of infinite domains of objects
and investigate the truth of (V) not for all points but for any given point,
say sⱼ. As we consider smaller and smaller intervals converging on sⱼ, it is
surely unreasonable to believe that the theory in terms of which the predicates
E and F are understood will remain accurately true. Small-scale
heterogeneities may arise, as when a macroscopically continuous fluid is
found to break up in the small scale into discrete molecules, or the theory
may be found altogether inadequate in the small scale, as when quantum
and nuclear theories superseded classical physics. If we believe that this
process has in principle no end, then we shall believe that putative laws
such as (V) have indeed zero confirmation, and that the laws which we in
fact make use of in inferences involve only quantification over finite sets
of intervals, where these can be made as small as is required. Thus, I
conclude, generalizations of type (V) can be made amenable to confirmation
theory for the same reasons and under the same conditions as those
of type (U) previously discussed.
Let us pause here and try to bring out the significance of these arguments.
The emphasis has not been laid on the impossibility of knowing of an
infinite number of instances of A’s whether all are B, but rather on the
impossibility of knowing in any single instance that all relevant predicates
have been taken into account. This might be expressed otherwise by saying
that the emphasis is not so much on enumerative as on eliminative induction,
for the way to investigate whether all relevant predicates have been
taken into account is to investigate variety of instances where many different
sets of predicates accompany A, rather than merely enumerating a large
number of the nearest A’s that come to hand, and this is the traditional
method of eliminative induction. This should not, however, be taken to
imply that laws do not of their nature have universal and where appropriate
infinite reference. It is not the potential reference of true laws to infinite
domains that is here in question, but the possibility of knowing, even
of a single instance, of what true law it is an instance. Neither is this point
the same as Jonathan Bennett has made in relation to a Humean versus a
Kantian account of laws.9 He has suggested that the difference is essentially
that a Humean will not, and a Kantian will, reject the idea of a
natural law in an infinite domain with isolated exceptions. The point being
made here is indifferent to such Humean or Kantian accounts because it
implies that on both these views infinite generalizations will have zero confirmation
but that on either of them generalizations in finite domains may
have finite confirmation. Nothing is here implied about what may really be
the case about laws for someone who knows what the true laws are; what
is implied is concerned only with what can reasonably be believed about
laws by us, whose knowledge is limited. For the same reason, it is not an
objection to the present account that it implies that the confirmation of
certain existential statements is unity. In an infinite universe, if the zero
confirmation of the generalization (U) corresponds to our intuitions, then
the negative existential “There is an A which is not B” has the highest
possible confirmation. But it surely cannot be held that such an empirical
existential statement is certain? This, however, is not the correct interpretation
of a c-value of unity. Confirmation values are concerned with our
beliefs, not directly with what actually is the case in the world; thus a
c-value of unity implies only that we have the best of all possible reasons
for believing that there is, somewhere in the infinite universe, an A which
is not a B. The arguments for this belief are the same as those for belief in
the falsity of (U).
A possible objection to this account has been made by Kneale.10 He
points out that the fact that we generally prefer when necessary to modify
our description of a single instance rather than abandon the search for
general laws does indicate that such laws play a specially important role
in scientific inference. That is to say, if we find an A that is not B, we may
prefer an unrefuted general law “All (A and C) are B” to the unrefuted
statistical generalization “p% of A’s are B.” This point, however, is not
directly relevant to c-theory, which in itself cannot dictate or explicate the
preference because, as we have seen, if the general law has confirmation
zero, so has the statistical generalization. If c-theory is to be relevant, therefore,
we must demand that both these generalizations are asserted only
in a finite domain and that then the confirmation of the first is greater
than that of the second. Whether this is satisfied or not will depend on the
nature of the total evidence and the assignment of prior probabilities in the
c-theory adopted. Nothing that has so far been said entails that this condition
cannot be satisfied; hence the preference for general laws as opposed
to statistical generalizations can be explicated within a c-theory in finite
domains. Incidentally, it has sometimes been held that if the confirmation
of such a finite law on the basis of reasonable-sized samples is nonzero
but implausibly low, the situation is as bad as for zero confirmation of
infinite generalizations. This is not so, however, because as long as the
confirmation is not vanishingly small, it is a merely technical problem (which
may of course be a difficult one) to devise a c-theory which makes its
value plausible, bearing in mind that we are not restricted to the simple
c-theories of Carnap’s first type, in which indeed such confirmation was
unacceptably low.
It will be noticed that the acceptability of this argument is intimately
related to a view of theoretical science similar to that of Popper, namely
that no universal generalization in an infinite domain can be shown to be
true or even probable. If it should seem strange that one who has totally
repudiated the possibility of c-theory should be brought in as a principal
witness in its defense, it should be pointed out that Popper wishes to
restrict science to universal generalizations in infinite domains. We may
agree with him that such generalizations have zero confirmation and also
that we are never in a position to say anything about their truth or even
probable truth. But we need not follow him in excluding from the aims of
theoretical science the confirmation of all lesser generalizations in finite
domains. It is to these that, we claim, confirmation theory is applicable.
It remains to show, against Popper, that such applicability is sufficient to
provide a logic of inductive inference.
2.3. It may be suggested at this point that our program has become altogether
too ambitious. Granted that we never know how far our statements
of law and theory, and even the language in which they are expressed, are
going to be adequate in future science, much less that they are going to remain
acceptable for all time, are we not in any case restricted in adopting
a confirmation theory to a particular basic language of extralogical predicates
within which that c-theory must necessarily be applied? It is surely
reasonable to believe that as long as science uses a particular basic language,
some universal generalizations will remain acceptable, and that
these generalizations will only be overthrown or modified if new primitive
predicates are introduced or old ones discarded. For example, the generalization
(x)(Ax ⊃ Bx) may not be eternal, but if A and B are primitive
predicates of the language L₁ in which it is made, whereas C or B' do not
occur in that language, it is somewhat strange to hold that (x)(Ax ⊃ Bx)
is false in L₁ simply because another generalization (x)(Ax.Cx ⊃ .Bx) or
(x)(Ax ⊃ .Bx v B'x) replaces it in a richer language L₂. This objection,
however, does not have much force, for if this is indeed the situation, then
(x)(Ax ⊃ Bx) is false in L₁ and in every other language containing the
predicates A and B. It is false because there are A’s which are not B, either
because there are A’s which are not C or because there are A’s which are B'
and not B. This does not of course mean that nothing true can be said
about A and B in L₁, only that no universal generalization (U) can be truly
asserted because (U) is in fact not true. Furthermore, it is perfectly possible
for a single c-theory to bridge the gap between L₁ and L₂. Most c-theories
satisfy the so-called “fitting-together” criterion for predicates,
namely, that the value of c(h,e) is to be independent of all predicates except
those contained in h or e. Thus the complete list of primitive predicates does
not have to be known before c(h,e) is evaluated, and the basic language
can in this respect be extended indefinitely without disturbing c-values
already obtained in proper parts of itself.
Another objection related to the linguistic basis of a c-theory has been
made by Nagel in his comments on Carnap’s theory of induction in the
Schilpp volume.11 Nagel remarks that universal laws may be indispensable
even in recognizing the evidence for a law. For example, confirmation of
the law “All swans are white,” even for a finite domain, requires acceptance
of some statement “a is a swan,” and this in turn presupposes universal
laws implicit in the universal term “swan.” Similar points have been made
by Popper,12 who would seem to hold that no evidential statement of this
kind is in principle a singular statement nor can be known conclusively to
be true, because the observation-report “a is a swan” can always be challenged
and further tested and such tests may result in withdrawal of the
statement—the light was bad; being a logician I was obsessed with the
concept of “swan”; what I saw flying across the lake was actually a goose.
If there is in principle no end to this process of challenge and test, then
indeed a statement such as “a is a swan” is not singular but has consequences
which are properly tested by events other than the observation
which prompted its utterance. It is not clear, however, that these consequences
are such that “a is a swan” must be taken to be an elliptic expression
of one or more universal generalizations in infinite domains. Indeed
on Popper’s view at least it would be highly paradoxical if this were so,
because if we can never know such a universal generalization to be true
or even probable, and if every statement of evidence both of positive and
negative instances of a generalization itself presupposes such knowledge, it
is difficult to see how any generalization can even be known to be false.
Why, except for entirely pragmatic reasons, should we accept, even provisionally,
one general statement rather than any other?
This problem is not so pressing on Nagel’s view, for presumably Nagel
does believe that we can know or be reasonably confident in the truth of a
statement such as “a is a swan.” In that case it may be suggested that he is
overambitious in requiring c-theory to explicate the conditions of applicability
of its own basic language. To adopt a language containing certain
extralogical predicates in terms of which to explicate the mutual inductive
relations of its statements is indeed to presuppose that the predicates
are universals, and we cannot give finite sets of necessary and sufficient
conditions for the applicability of universal terms. But it does not follow
that the process of applying universal terms is in the least like that of coming
to accept universal laws, nor that it depends on probabilities in infinite
domains. Indeed the difficulties involved in any such construal of universals
may well suggest that here as well as in our account of induction we ought
to eschew infinities altogether. At worst, Nagel’s objection means that
c-theory does not give an account of the logic of the learning of its own
basic language, but it is not at all clear why it should be expected to do so.
To require this would be something like requiring a formalization of
arithmetic to define uniquely the natural numbers, a task which, though
originally worth attempting, we now know to be impossible, but we
do not thereby regard the formalization of arithmetic as an impossible
or worthless enterprise.
Nagel has a related objection, based on the supposition that experimental
tests always themselves depend on acceptance of the universal laws governing
the test apparatus and surrounding test conditions. But here, surely,
infinite domains need not be involved, only high probability over finite
domains which are defined to be as large as is necessary. Generalizations
concerned with instruments do not even need to be universally true—the
laws of classical optics, for example, may provide a sufficient approximation
to the behavior of a telescope even when it is testing predictions derived
from a theory which contradicts classical optics. It follows that any presupposition
involved in accepting the evidence as highly probable can always
be stated without loss in terms of high confirmation of generalizations
in finite domains.
Another objection can be dealt with equally briefly. This is that if all
infinite generalizations have zero confirmation, then a generalization which
has not yet been refuted is no better off than one which is known to be
false. And yet we must consider the status of the former as different from
the latter. My reply to this would be that we do consider it as different, but
only in virtue of the confirmation of the generalizations over limited domains
which are its consequences. That is to say, what we actually use in theoretical
inference are statements of the form “All the next n A’s in a limited
region of space and time are B,” where n and the limited region may be
as large as we like to specify. Such statements have nonzero confirmation in
any well-behaved c-theory. The value of that confirmation will however be
very different according as the evidence does or does not contain instances
of A which are not B. In other words, our confidence in predicting of the
next million A’s that they will be B will be much diminished if the strictly
universal generalization “All A’s are B” has already been refuted.
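
The contrast just drawn, between a generalization over the next n instances and a strictly universal one, can be illustrated numerically. What follows is a minimal sketch only. It assumes, as the text itself does not, that confirmation is measured by Laplace's rule of succession (a uniform prior over the unknown proportion of A's that are B). The function name `confirm_next_n` and its parameters are hypothetical and belong to no c-theory discussed in this paper.

```python
from fractions import Fraction


def confirm_next_n(positive, negative, n):
    """Confirmation that the next n A's are all B.

    Assumption (not from the text): Laplace's rule of succession, i.e. a
    uniform prior over the unknown proportion of A's that are B.
    `positive` / `negative` count the observed A's that were / were not B.
    """
    value = Fraction(1)
    for i in range(n):
        # After i further positive instances, the next A is B with
        # probability (positives so far + 1) / (observations so far + 2).
        value *= Fraction(positive + 1 + i, positive + negative + 2 + i)
    return value


if __name__ == "__main__":
    for n in (10, 100, 1000):
        unrefuted = confirm_next_n(1000, 0, n)  # no counterinstance observed
        refuted = confirm_next_n(1000, 1, n)    # one A that was not B
        print(n, float(unrefuted), float(refuted))
```

On this assumed measure, with no negative instances the product telescopes to (positive + 1)/(positive + n + 1). That value is nonzero for every finite n, approaches unity as favorable evidence accumulates for fixed n, and tends to zero only as n grows without bound. A single negative instance in the evidence lowers the value for every n, which is the sense in which an unrefuted generalization is better off than a refuted one.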

3. THE NECESSITY OF LAWS

3.1. So far, we have considered only objections to the zero confirmation
of laws arising from what might be called a Humean account. There is,
however, an influential construal of laws in which it is claimed that they
are asserted not only for a potentially infinite number of actual instances
but also for a potentially infinite number of “conditional” or “counterfactual”
instances. That is, not only are all (perhaps a finite number) of
actual ravens asserted to be black, but it is also asserted that if anything
(out of an infinite number of things) were a raven (and some are and some
are not), then it would be black. There is much dispute over the question of
whether such counterfactual inferences are in fact required in science. I
am one of those who believe that they are,13 thus making the confirmation
problem as difficult for myself as possible. However, I think it can be argued
that this view does not after all make the problem more difficult than it is
already, because once the controversial introduction of counterfactual
inferences has been performed, no new problems arise about their confirmation.
The difficulty can be put like this. If counterfactual inferences from laws
are required, then we require finite confirmation, not only of singular assertions
of the form “This raven is black,” but also of “If this thing were
a raven (though it is not), it would be black.” Leaving aside the question
of how we would supplement the logic of material implication to allow this
latter assertion to be nonvacuous, it would be well confirmed if it was
entailed by the universal law “All ravens are black,” which was itself well
confirmed. Those who assert that such counterfactual inferences are required
in science also assert that they are entailed by universal laws, so this part
of the requirement is met. But of course the second part of the requirement
cannot be met because the universal law cannot have confirmation other
than zero in an infinite domain. Does the fact that the entailment holds
imply that the law has an infinite domain? It is not clear that it does. On the
view being defended here, the assertion which does have finite confirmation
is of the form “All ravens in the next million observed in this fairly large
space-time region are black” (call this (W)). Given that the reason this is
confirmed is not that it is not lawlike but that it does not presuppose that
we know the whole truth about ravens everywhere and everywhen, it
seems reasonable to maintain that if (and no doubt only if) “All ravens
are black” entails “If this were a raven this would be black,” then (W)
entails “If this object in this space-time region were a raven it would be
black.” An assertion in a finite domain which is said to be lawlike in virtue
of entailing counterfactual consequences does not thereby acquire an infinite
domain of application, or at least nothing that has so far been said in this
controversy suggests that it does.
3.2. There is, however, a deeper objection to the whole idea of explicating
confirmation in terms of probability, which has been expressed most
forcefully by Kneale in Probability and Induction. Kneale’s interpretation of
scientific laws is that they are principles of physical necessitation, which he
distinguishes on the one hand from logical necessity and on the other from
universal co-occurrence, or accidental generality. This notion of physical
necessity is itself far from clear, but at least the consequences which Kneale
claims that it has for a probabilistic theory of confirmation can perhaps be
evaluated without raising the much more difficult issue of necessity itself.
It should be noticed first that Kneale himself does not claim that the issue
of necessity versus accidental generality is co-extensive with the difference
between strictly universal and statistical generalizations. The latter may also
be necessary or they may be accidental. They may be necessary either because
they are consequences of necessary and strictly universal laws (as
in classical statistical mechanics), or because they are concerned with objective
and irreducible probabilistic propensities (as we believe to be the
case with the fundamental particles). Thus this notion of necessity must not
be confused with the quite different notion of strictly universal determinism,
and whenever Kneale uses the expression “A-ness necessitates B-ness” he
is to be understood as subsuming expressions of the form “A-ness necessitates
B-ness in p% of the cases, and not-B-ness in the rest.”
Kneale’s argument is briefly as follows.14 If the generalization “All A’s
are B” is a law of nature, it is properly construed as “A-ness necessitates
B-ness.” However in a probabilistic theory of confirmation in which the
probability of a law on given evidence is taken to be some function of the
number of worlds in which law and evidence are both true, and of the total
number of worlds in which the evidence is true, we have to contemplate
“All A’s are B” as true in some one out of a number of possible worlds.
But if A-ness necessitates B-ness, there cannot possibly be such other worlds
in which it is not true. In the same way the truth of a mathematical theorem
cannot be just one of the possibilities, because either it is true, and the
alternatives are impossible, or it is false, and then not possibly true.
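
The kind of "counting of worlds" that Kneale has in mind can be made concrete for a very small domain. The sketch below is illustrative only. It assumes that every possible world (state description) receives equal weight, a deliberately crude measure chosen for simplicity; it is neither Kneale's proposal nor a c-theory endorsed in this paper. The names `CELLS`, `law_holds`, `evidence_holds`, and `confirmation` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Each individual falls in one of four cells: (is A, is B).
CELLS = [(True, True), (True, False), (False, True), (False, False)]


def law_holds(world):
    """'All A's are B' is true in a world if no individual is A and not B."""
    return all(is_b for (is_a, is_b) in world if is_a)


def evidence_holds(world, k):
    """Evidence: the first k individuals have been observed to be A and B."""
    return all(cell == (True, True) for cell in world[:k])


def confirmation(n, k):
    """Ratio of worlds where law and evidence hold to worlds where evidence holds.

    Assumption: all state descriptions of the n individuals are equally weighted.
    """
    evidence_worlds = [w for w in product(CELLS, repeat=n) if evidence_holds(w, k)]
    both = [w for w in evidence_worlds if law_holds(w)]
    return Fraction(len(both), len(evidence_worlds))


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, confirmation(n, 2))
```

For example, `confirmation(4, 2)` evaluates to 9/16: of the 16 worlds compatible with the evidence, 9 contain no A that is not B. The value is computed over worlds that are possible relative to what we know, which is the sense of "possible" at issue in what follows.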
This argument undoubtedly has some initial appeal, because the conception
of physical necessity sits uneasily with that of random sampling on
which probabilistic methods are based in statistics. But if we draw a distinction
between the necessity of a law as it is in nature and the apparent
randomness of the ways in which we may come to know it, the argument
can, I think, be shown to depend on a pun on the word “possible.” The
situation to which c-theory is relevant is that of contemplation of a
potential law “All A’s are B” when we do not know on the basis of the
evidence available whether this generalization is true or not. If it is a law,
then on Kneale’s view A necessitates B, and then in one sense it is impossible
that A should be not-B. But it does not follow, since by definition
we do not know that “A necessitates B” is a law, that it is impossible that
“A necessitates B” should be false; hence there are other possible
lawlike statements, some of which may be true. And Kneale has not
noticed that if his objection were conclusive it would also count against
the use of probability even for accidental universal generalizations, for if
it is a cosmic fact that all A’s are as a matter of fact B’s, it is likewise not
“possible” that an A should not be a B. But this rests more obviously on
a pun on “possible,” and Kneale is not deceived by it, for he accepts the
ordinary scientific use of probability for chance occurrences. The case of
logical necessity to which he appeals is not conclusive for his argument
either, for when we do not know whether a potential mathematical theorem
is true or false (as with Goldbach’s conjecture, for example), there
is one sense of “possible” in which it is possibly true and possibly false.
But in this case, since mathematical theorems, unlike scientific laws, are
matters of proof, it is not likely that our degree of belief in Goldbach’s
conjecture is happily explicated by probability-functions. Whether confirmation
of laws is so explicable is a question of whether our degree of reasonable
belief in them behaves like a probability-function in the light of confirming
evidence and cannot be decided a priori by arguments about physical
necessity, however that may be understood.
3.3. There is implicit in Kneale’s discussion another serious question
about laws vis-a-vis accidental generalizations. Since c-theory depends
merely on the syntactic relation between evidence-statements and law-statements,
it cannot distinguish laws from accidental generalizations, and in
particular it cannot reflect our greater willingness to allow confirmation to
a potential law than to an accidental generalization, even on similar evidence.
For example, considered merely as a relation between evidence and
generalization, a sample of viruses all of which have caused Asian flu in
persons whose birthdays lie in June is no better evidence for the generalization
“All these viruses cause Asian flu” than it is for “All these viruses
attack only people whose birthdays are in June.” And yet we should be
much more inclined to regard the former as well confirmed than the latter.
The first remark to be made about this difficulty is that it is not reasonable
to expect c-theory alone to decide for us how we are to distinguish
laws from accidental generalizations. Since it is almost universally admitted
that the necessity of laws does not reveal itself by peering more closely at
the events which may be their instances, it will not be revealed either by a
c-theory which can by definition only take account of the statement of the
law and the statement of all the evidence that is available for it. Neither
is the problem removed by building into a c-theory a logical connective
equivalent to “necessitates” as well as the usual connective of material
implication (as has been suggested for other reasons by L. J. Cohen). For
in such a theory the problem would be that unless prior distinctions were
made, all accidental generalizations not yet contradicted by the evidence
would get as high confirmation as potential laws. Instead of all laws looking
like accidental generalizations, as in probabilistic theories, all accidental
generalizations would look like laws. The question how to distinguish
potential laws from accidental generalizations must be decided
independently of c-theory, but once it has been decided in a form which
can be expressed in terms of c-theory, we shall of course be entitled to
demand that the c-theory give us relative confirmation values in accordance
with our usual assumptions about the relative confirmability of the two
kinds of statement.

4. ANALOGICAL INFERENCE TO LAWS

One widely accepted way of making this distinction is the view15 that
the generalizations we are prepared to accept as potential laws are those
that follow from the conjunction of appropriate boundary conditions with
more or less systematic theories, which are in turn supported by a variety
of other laws. The generalizations that we regard as accidental, on the other
hand, are those which are not so connected with other generalizations by
theories. This view demands of a c-theory that when laws are consequences
of a theory they should be given confirmation higher than the confirmation
they receive from their instances alone, for they share the latter possibility
of confirmation with accidental generalizations, but not the former.
The view is, however, open to the general objection that some potential
laws can be distinguished from accidental generalizations even when no
systematic theory is present; for example we may be quite ignorant of any
explanatory theory of how the virus causes Asian flu and yet be prepared
to accept this correlation as lawlike. This is one reason why it is not satisfactory
to adopt this view as a basis for the confirmation of laws. There is
also a more fundamental reason why it in fact cannot be accepted as it
stands as defining the conditions of adequacy of a c-theory. It is easy to
show that, in any probabilistic c-theory, the confirmation of a law L cannot
be increased above its value (cᵢ) on the evidence of its own instances,
merely in virtue of its deducibility from a theory T, unless it is also increased
by all the evidence for the theory even before the theory has been
explicitly formulated. That is to say, supposing the whole of the evidence
for T is E, then since c(L,E) is a single-valued function of L and E only,
its value is independent of any theory T which may be or may have been
proposed to explain E in the deducibility sense of explanation. This result
means in general that if probabilistic c-theory is to explicate the relation
between explanatory theories and laws, theories must satisfy stronger conditions
than those required on the deducibility account because this account
is not sufficient to show that theories are relevant to the confirmation of
laws. I have suggested elsewhere16 that the further conditions to be imposed
on T must be derived from relations of analogy between the objects specified
in the total evidence E, so that T is not so much one among many possible
postulate systems having E among its consequences as the explicit statement
of the analogies which exist among a number of empirical systems
including E, the best theory being the one which embodies the strongest
and most extensive analogy. A theory would therefore be established when
the number of analogous systems realizing it is large, and this
would incidentally have the effect that, given appropriate differentiating
boundary conditions, the number of its empirical consequences is also
large.
On such a view it is at once possible to understand why we accept the
correlation of viruses with Asian flu as lawlike and the correlation of viruses
with June birthdays as accidental. The former correlation does, and the
latter does not, have analogies with many other empirically established
correlations, and this can be known even before more sophisticated theories
of the disease are available. In order to ensure that the confirmation
c(L,E) is greater than cᵢ, it is necessary only to impose upon the c-theory
the requirement that evidence comprising data about a large number of
analogous objects or systems of objects yields in general higher c-value for
L than evidence comprising only isolated objects or systems which are
instances of L.
The view that at least some types of explanatory theory are best regarded
as explicit statements of analogies among instances is a very natural extension
of the view of laws which has been maintained in this paper, for
exactly the same considerations apply to the deducibility of instances from
general laws as to the deducibility of laws from theories. No probabilistic
c-theory will give confirmation to a particular prediction, for example “The
next raven will be black” (I) on the basis of observed instances of black
ravens (E), which is greater than c(I,E), merely in virtue of confirmation
of the generalization “All ravens are black” from which (together with
boundary conditions) both I and E are deducible. It may or may not be the
case that universal generalizations in infinite domains can be given finite
confirmation, but confirmation of instances is entirely independent of this
possibility, and the result just stated holds in either case. Thus the confirmation
of the generalization may be zero without entailing that c(I,E)
is zero.
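
A worked illustration of this last point may help. It assumes Laplace's rule of succession as the confirmation function, which is an assumption made here for illustration and not a c-theory proposed in this paper. Let E report m observed ravens, all black; let I be “The next raven will be black”; and let G_n (a label introduced here) be “The next n ravens will all be black.”

```latex
c(I,E) = \frac{m+1}{m+2},
\qquad
c(G_n,E) = \prod_{i=0}^{n-1} \frac{m+1+i}{m+2+i} = \frac{m+1}{m+n+1}
\;\longrightarrow\; 0 \quad (n \to \infty).
```

On this assumed measure c(I,E) approaches unity as m increases, although the confirmation of the generalization over an unbounded number of further ravens is zero.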

5. CONCLUSIONS

The arguments of this paper have not depended upon any particular
assignment of prior confirmation to theoretical statements, but have rather
been intended to elicit some of the criteria of adequacy which such an
assignment should satisfy. It has been argued on the one hand that finite
prior confirmation in an infinite domain is not one of these criteria, but on
the other hand suggestions have emerged regarding three criteria which
are required and which may be summarized as follows:
(A) Variety of instances in the evidence should tend to confirm a generalization
more highly than the same number of instances all of the
same kind.
(B) In inductions from instance to instance or from finite law to finite law,
confirmation should increase with the degree of analogy between instances.
Apart from its intuitive desirability, this requirement offers
the best hope of explicating within confirmation theory those inductions
which are normally said to depend upon explanatory theories.
(C) Although the confirmation of a law in infinite domains remains zero,
it should be possible to show that its confirmation in specified finite
domains, however large, does reach a value reasonably near unity with
increasing amounts of favorable evidence.

There is at present no c-theory which satisfies all of (A)-(C). The
theory of c-functions is in fact still in a comparatively primitive state, and
there is as yet, for example, no c-theory for a language of N individuals
capable of dealing even with dyadic relations.17 It can, however, already be
shown, for a domain of not more than four individuals and a finite number
of nonmetric monadic and dyadic predicates, that Carnap’s η-theory, with
some obvious extensions, does satisfy the conditions specified in (A) and
(B).18
Apart from the as yet unsolved question of an adequate c-theory,
however, a certain attitude toward laws and theories has emerged from this
discussion which may have some independent interest. In conclusion let
me summarize the main characteristics of this attitude.
1. It has been argued that it is unreasonable to believe that any universal
generalization in an infinite domain, as stated at a given stage of science, is
true.
2. Inductive inference (and perhaps also use of universal terms) can be
analyzed without loss into inference from particulars to particulars, or at
most to generalizations in finite, though large, domains.
3. This view does not detract in any way from the special features that
90 1. SCIENCE AND INQUIRY

have been ascribed to general laws, specifically not in the following respects:
(a) It does not imply that statistical generalizations should be preferred to
universal laws.
(b) It does not imply that c-theory cannot assign finite confirmation to
counterfactual inferences.
(c) It does not imply that laws are not in some sense necessary.
(d) It does not imply that laws cannot be distinguished from accidental
generalizations.
(e) It does not empty the notion of law in an infinite domain of all
significance. It merely denies that we can ever have any reasonable grounds
for believing such a law to be true.
4. Neither does this view devalue theories. It rather reinterprets theories
as expressions of the analogies between their instances, in virtue of which
analogical inferences can be made to further finite sets of instances.

NOTES
1. Principles of the Theory of Probability, International Encyclopedia of Unified
Science, Vol. I, No. 6, Chicago, 1939.
2. The Philosophy of Rudolf Carnap, ed. P. A. Schilpp, La Salle, Ill., 1963,
pp. 785-825.
3. Part of this paper was read at a colloquium at Oberlin College, April 15-17,
1966. I am much indebted to the commentators on this occasion, Professors
P. Achinstein and P. Alexander, and to others who took part in the discussion. I
wish especially to acknowledge a correction to my original formulation of the
beginning of the second paragraph, Section 2.2, which was made by Professor
Achinstein, and has been incorporated in the present version.
4. R. Carnap: Logical Foundations of Probability, Chicago, 1950; The
Continuum of Inductive Methods, Chicago, 1952; Induktive Logik und Wahrscheinlichkeit
(with W. Stegmüller), Vienna, 1959.
J. Hintikka: “Towards a Theory of Inductive Generalization,” Logic, Methodology
and Philosophy of Science, ed. Y. Bar-Hillel, Amsterdam, 1965, p. 274; “On a
Combined System of Inductive Logic,” Acta Philosophica Fennica, 18, 1965, 21;
“Induction by Enumeration and Induction by Elimination,” The Problem of Inductive
Logic, ed. I. Lakatos, Amsterdam, 1968, p. 191, and my comment on the latter
article, ibid., p. 220.
P. Achinstein: “Variety and Analogy in Confirmation Theory,” Phil. Sci. 30, 1963,
p. 216; and Carnap’s reply, ibid. p. 225.
Mary Hesse: “Analogy and Confirmation Theory,” Phil. Sci. 31, 1964, p. 319.
5. M. Bunge: The Myth of Simplicity, Englewood Cliffs, N.J., 1963, p. 116ff.
L. J. Cohen: “What has Confirmation to do with Probabilities?,” Mind, 75, 1966,
p. 463; “A Logic for Evidential Support,” Brit. Journ. Phil. Sci., 17, 1966, pp. 21
and 105.
W. C. Kneale: Probability and Induction, Oxford, 1949.
H. Kyburg: Probability and the Logic of Rational Belief, Middletown, Conn.,
1961.
K. R. Popper: Logic of Scientific Discovery, London, 1959, Appendices *vii-*ix.
6. The Philosophy of Rudolf Carnap, p. 977.
7. “‘Content’ and ‘Degree of Confirmation,’” Brit. Journ. Phil. Sci., 6, 1955, p.
160.
8. “On the Confirmation of Laws,” Phil. Sci., 26, 1959, p. 14.
9. “The Status of Determinism,” Brit. Journ. Phil. Sci., 14, 1963, p. 106.
10. Probability and Induction, p. 73.
11. Op. cit., p. 802.
12. See for example Logic of Scientific Discovery, p. 94.
13. Cf. P. Alexander and M. Hesse: Symposium on “Subjunctive Conditionals,”
Aris. Soc. Supp. Vol. 36, 1962, p. 185.
14. Op. cit., p. 211 ff.
15. See for example R. B. Braithwaite: Scientific Explanation, Ch. 9; E. Nagel:
The Structure of Science, Ch. 5.
16. “Consilience of Inductions,” The Problem of Inductive Logic, p. 232.
17. See Logical Foundations of Probability, p. 124.
18. Induktive Logik und Wahrscheinlichkeit, Appendix B. Some of the proofs are
given in my “Analogy and Confirmation Theory,” Phil. Sci. 31, 1964, p. 319; others
are as yet unpublished.
INDUCTION AND THE AIMS OF INQUIRY'
Isaac Levi

“The conclusions of science are,” according to Ernest Nagel, “the fruits of
an institutionalized system of inquiry.”1 This implies, as Nagel has often
pointed out, that the products of scientific inquiry are evaluated according
to certain standards and that conformity to such standards is a means for
promoting the aims of inquiry.2
Nagel’s writings have greatly enhanced our understanding both of the
aims of inquiry and of the standards for legitimate inference. His
contribution is to be found not only in the considerable insights he has offered
into the structure of scientific explanation but also in the interesting
questions which he has raised for further study.
In recent years, appeals have been made to current theories of statistical
inference to show that scientists must make value judgements in reaching
conclusions.3 Nagel comments on the import of such arguments for the
social sciences as follows:
However, the theoretical analysis upon which this thesis rests does not entail
the conclusion that the rules actually employed in every social inquiry for
assessing evidence necessarily involve some special commitments, i.e.,
commitments such as those mentioned in the above example, as distinct from those
generally implicit in science as an enterprise aiming to achieve reliable
knowledge.4

Nagel also observes that nothing in the reasoning of theoretical statistics
depends upon whether the subject matter under consideration belongs to
the natural or to the social sciences. Hence, were it the case that such
inference required the introduction of special moral, political, economic or
prudential value judgements, this would be true both for the natural and
the social sciences.5 In the passage just cited, Nagel does not deny that
value commitments are implicit in science or that such commitments
control the legitimacy of scientific inferences. What he doubts is that these
commitments are those of self-interest, morals, politics, or economics.
Science as “an institutionalized system of inquiry” has its own aims and
ISAAC LEVI 93

its own values, and it is these values which control the legitimacy of
inferences.
Nagel’s remarks suggest a question for study of considerable
philosophical interest: Can an account of standards of legitimate inference be
constructed which shows how conformity to these standards promotes the aims
of inquiry?6 This question can be raised about statistical inference in
particular and about scientific inference in general.
To construct an account of scientific inference of the sort required, some
conception of the relevant features of legitimate inference must be available.
In addition, the aims of inquiry which control the legitimacy of scientific
inferences must be identified. Finally, a theory of rational goal attainment—
i.e., a theory of rational decision-making—must be introduced in terms of
which it can be shown that conforming to standards of legitimate inference
best promotes the aims of science.
All of the topics just mentioned are surrounded by controversy.
Consequently, each step in a decision-theoretic account of scientific inference
requires a rather elaborate defense; and even with the aid of such
arguments, the proposals thus supported could only pretend to be tentative.
Space will not allow for extensive discussion of all the controversial steps to
be taken in the following discussion. Therefore, the suggestions made here
should be taken only as a basis for further discussion and not as in any
sense final.
Thus, in spite of the well known difficulties involved in using the
injunction to maximize expected utilities as the basis for a decision theory, this
principle will be adopted here. The difficulties are accompanied by certain
advantages. First, this procedure allows something to be said about the
way in which scientific inferences are related to probabilities.7 In addition,
utility theory can be used as an analytic device for representing the aims
of science which control the legitimacy of inferences. These utilities, of
course, are not to be understood as representing satisfactions or subjective
preferences of individual investigators. Rather they are to be interpreted
as indices of the relative importance which scientists are committed to
attaching to the consequences of their inferences insofar as they are
committed to promoting the institutional aims of science.
In order to apply the injunction to maximize expected utility,
something must be done about identifying the “cognitive decision problem”
facing an investigator when he uses given evidence to reach a conclusion.
This involves specifying the “options” open to him, the relevant
“outcomes” of these options, the scientific objectives controlling his “decision”,
and the representation of these objectives with the aid of an “epistemic”
utility function.8 The subsequent discussion is devoted to an examination of
these problems.

II

Carl Hempel, who was the first to publish a cognitive decision model of
the sort to be considered here,9 has, with his customary clarity, isolated
the problem with which he is concerned as follows:
. . . let us suppose now that a scientist has at his disposal the set of all sentences
accepted at the time, which we may assume to be expressed in the form of one
complicated sentence e; that he has invented, or has been presented with a
set of n hypotheses, h1, h2, . . . , hn, which, on e, are pairwise incompatible
while jointly exhausting all possibilities . . . ; and that he has to choose one from
the following n + 1 courses of action: To accept h1 and add it to e; . . . ; to accept
hn and add it to e; to accept none of the n hypotheses and thus leave e
unchanged. The problem is to construct a rule that will determine which choice it
is rational to make.10
Hempel disclaims any effort to devise a procedure for generating the
list of sentences h1, h2, . . . , hn.
. . . the construction of such hypotheses requires, in general, scientific
inventiveness and, in important cases, great genius; it cannot be achieved by the
mechanical use of induction rules. The inductive problem here considered is
rather that of deciding on the available evidence—which may include the
results of extensive tests—, which, if any, of the proposed hypotheses is to be
“accepted” and thus to be added to the corpus of scientific knowledge.11

Hempel then suggests that, as a first approximation, the proverbial
“pursuit of truth” in science may be construed as “aimed at the
establishment of a maximal system of true statements.”12 This is the goal which,
in Hempel’s view, controls the decision as to which, if any, of the
proposed hypotheses is to be “added to the corpus of scientific knowledge.”
These remarks leave the impression that an investigator who reaches a
conclusion does so by selecting one from a list of hypotheses which has
either been presented to him or has been invented by him. Although
Hempel is right in doubting the availability of any prescription for
generating hypotheses, this does not mean that such a list is immune from
criticism in its own right. An investigator will not even consider choosing
between members of a set of hypotheses unless these hypotheses provide
relevant answers to the question he is investigating. Hypotheses about
the genetic code simply are not relevant to someone concerned to measure
the Doppler effect exhibited by a quasar. Thus, an inspired genius might
provide a list of possible answers; but unless they are answers to a
question, his genius has been wasted on the parlor game: “I’ve got an answer.
Now give me a question.”
But if the adequacy of a list of answers is controlled by the scientific aim
of identifying the relevant possible answers to a given question, it is also
true that the problems chosen for scientific study are themselves evaluated
in the light of other scientific objectives. Again, there is no prescription
for proposing important scientific problems; but some problems are more
important from a scientific point of view than others. Thus, if the quest
for explanations is, as Nagel and Hempel have both maintained, one of
the distinctive aims of science, one way in which it affects inquiry is in the
choice of problems for study. A problem involving a choice between theories
is of more importance relative to this aim than is a question about the
detailed behavior of a specific pendulum.
Observe, however, that it does not follow that the quest for explanations
is the immediate goal controlling the selection of the list of relevant answers
to a given question or, if it is a controlling factor, that its relevance is of
the same sort as it is when appraising the importance of alternative
problems. By the same token, the quest for explanations need not be the
immediate goal to be realized when an investigator who has already chosen
a problem and has found an adequate list of possible relevant answers is
about to decide which of these relevant answers is the correct one relative
to his evidence. Choosing problems, supplying possible answers to them,
and choosing between these possible answers (i.e., making inferences)
are distinct types of scientific activity. These activities are interrelated
and, in the flux of real life scientific inquiry, are often difficult to separate.
But, for purposes of analysis, they should be distinguished and the goals
which control them should not be confused.
These points suggest two emendations of Hempel’s description of the
cognitive decision problem which an investigator is attempting to solve
when he makes inferences. The first is that he is not to be viewed as
choosing between members of any list of alternative hypotheses but rather as
choosing between alternative possible answers to a given question. What
this means is that the legitimacy of any inference he makes depends not
only on the evidence available but also on the question under
consideration and the set of relevant answers to that question.
The second emendation concerns his conception of the relevant
“cognitive objective” as being to establish maximal systems of true statements.
Although this goal may be a factor controlling the choice of problems
for study, it surely cannot be the immediate aim of an effort to provide
an answer to a given question relative to given evidence. If this were so,
the alternative hypotheses from which investigators would choose would
have to be the strongest and most complete possible descriptions of the
world which they could provide compatible with the constraint of
consistency and the limitations of imagination.
Strictly speaking, the goals involved in picking between alternative
hypotheses are as many as the problems raised. An investigator studying
quasars has different “cognitive aims” than a student of the genetic code.
The best that can be done here is to identify certain features which the
goals embodied in broad classes of scientific problems share in common.
Hempel’s provisional characterization of the “proverbial pursuit of truth”
can be understood as suggesting two such features. The first is that the
answer chosen be true. The second is that the answer chosen be as
“maximal” as the question demands. This desideratum requires further
elaboration. But, even as stated here, it points to one place where the question
raised and the list of relevant answers to it become important to the
legitimacy of the conclusion drawn from the evidence.
There is still another type of scientific activity which should be
distinguished from drawing conclusions from given evidence relative to a
specific question and a list of relevant answers to that question. It is one
thing to reach a conclusion relative to given evidence e and quite another to
decide that the evidence e is so adequate that conducting further tests or
looking for new evidence is pointless. On some occasions, such activity is,
indeed, pointless. There is little to be gained, from a scientific point of
view, in testing the claim that Newton’s mechanics is adequate for
middle-sized objects at moderate velocities and within moderate limits of
accuracy.13 In other cases, however, the available evidence may warrant
accepting a given conclusion as the correct answer to a given question but
further inquiry may still be of considerable scientific importance. A
distinction must, therefore, be made between the aims controlling the choice
between relevant possible answers to a given question relative to fixed
evidence and those controlling the decision as to whether inquiry ought to be
continued.
This distinction is overlooked by Hempel. Hempel’s problem is to
decide “which, if any, of the proposed hypotheses is to be ‘accepted’ and thus
to be added to the corpus of scientific knowledge.” Hempel seems to be
saying that when some sentence h is accepted, it can legitimately be added
to the total evidence for purposes of making further inferences. But if the
sentence h is accepted only tentatively in the sense that further inquiry
is called for, adding h to the evidence would result in a form of question
begging.
Thus, Hempel’s “inductive problem” ought to be broken up (at least
for purposes of analysis) into at least two parts: the problem of deciding
which of a set of relevant answers to a given question to pick as the correct
one to that question relative to fixed evidence, and the problem of deciding
whether the available evidence is sufficient for terminating the investigation
or whether more evidence is needed. The subsequent discussion is
concerned primarily with the first of these two problems.

III

Suppose that an investigator attempting to answer a given question Q
has succeeded in identifying a suitable list of relevant answers to his
question and has available to him total evidence e. The first problem to be faced
in setting up a cognitive decision model for choosing among the relevant
possible answers is to find a standardized way of representing the
possible answers.
Let U, to be called “the ultimate partition,” be a set of n sentences
h1, h2, . . . , hn such that e implies that at least and at most one of the hi’s
is true and such that each of the hi’s is consistent with e. Each element of
U is a “strongest consistent relevant answer to the question Q” in the
sense that, given e, no other relevant answer entails an element of U
(unless it is equivalent to it, given e). If the elements of U are
alphabetized, all relevant answers to the question will be equivalent, given e, to
one and only one sentence in the set M generated by U as follows: All
disjunctions of m distinct elements of U (0 < m ≤ n) in which each
disjunct appears once and only once and in alphabetical order together with
the sentence C which is the conjunction of all elements of U in which each
conjunct appears once and only once and in alphabetical order. M has 2^n
elements.14
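
The construction of M from U can be made concrete with a short sketch. The Python below is illustrative only and is not part of the original text: encoding each sentence as the set of its disjuncts, letting the empty set stand for the contradictory sentence C, and the name generate_M are all assumptions introduced here.

```python
from itertools import combinations


def generate_M(partition):
    """Enumerate the sentences generated by an ultimate partition U.

    Each sentence is encoded as the frozenset of elements of U occurring
    as its disjuncts. The empty frozenset stands for C, the conjunction
    of all elements of U, which is contradictory given e (an encoding
    assumed for this sketch).
    """
    names = sorted(partition)  # the "alphabetized" elements of U
    sentences = []
    for m in range(len(names) + 1):  # m = 0 yields C; m = n yields S
        for disjuncts in combinations(names, m):
            sentences.append(frozenset(disjuncts))
    return sentences


U = ["h1", "h2", "h3"]
M = generate_M(U)
print(len(M))  # 2**3 = 8 sentences
```

With three elements in U the sketch lists eight sentences: C, the three elements themselves, the three two-member disjunctions, and the disjunction of all three.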
Strictly speaking, an investigator does not answer a question by
accepting an element of M. He is committed to accepting a set of sentences
consisting of some element H in M together with all the deductive
consequences of H and his evidence e. Thus, H is better characterized as the
strongest sentence accepted via induction from the evidence e relative to
the ultimate partition U.
This way of looking at the situation explains why two investigators
disagree where one accepts H and the second accepts HvG without
accepting H. The sentences they accept are not inconsistent with one another; but
one accepts a different system of sentences than the other.
To accept C as strongest (where C is the conjunction of all elements in
U) is to contradict oneself. In so doing, one obtains as strong (as
“maximal”) an answer to the question as is possible but with the inevitable
consequence that some of the sentences accepted will be false. Hempel
does not mention the option of accepting C as strongest among his
cognitive options. Given that any adequate account of scientific inference
should insure that this option is never taken, his omission is not serious.
However, it is useful to take explicit notice of it in order to show that it
will indeed never be taken.
In Hempel’s setup,15 the remaining options correspond (when modified
to meet the requirements introduced previously) to cases of accepting hi
as strongest where hi is an element of the ultimate partition U. There is
one exception. Hempel allows for suspension of judgement. This is
represented in the scheme proposed here by the case of accepting S as strongest
where S is the disjunction of all n elements of U. To accept S as strongest
is to accept only the deductive consequences of the total evidence e.
The trouble with Hempel’s scheme shows up where U has more than
two elements. If there are three, h1, h2, and h3, then in addition to accepting
some hi as strongest, accepting C (h1 & h2 & h3) as strongest or accepting S
(h1 v h2 v h3) as strongest, accepting h1 v h2, h1 v h3 or h2 v h3 as strongest are
options always open to the investigator and should be considered in any
account. Instead of the n + 1 options mentioned by Hempel, there are 2^n.
Of course, when U has exactly 2 elements, the two schemes are identical.

IV

A “cognitive decision problem” of the sort considered here is one
where an investigator is to choose one of a list of sentences in a set M
generated by an ultimate partition U to accept as strongest via induction
from his total evidence e. U consists of the strongest consistent relevant
answers to his questions and the sentence H in M accepted as strongest
represents the conclusions inferred from the evidence.
The aim of the investigator in choosing an answer is to provide one
which is both true and as maximal as the question demands. Accepting
H as strongest yields a true answer if and only if, given the truth of the
evidence e, all of the sentences thereby accepted are true. This holds if
and only if H is true.
By a maximal answer, one is to understand an informative answer or
one bearing high content. The elements of M can be partially ordered with
respect to content in a manner about which all writers who have concerned
themselves with content would agree. H has more content than G relative
to the evidence e if and only if it entails G given e. Moreover, if a cardinal
measure of amount of content is attempted, agreement can be won as to
the formal properties of this measure.
Let m(H,e) be a normalized probability measure defined over M and
all sentences which are equivalent given e to elements of M.16 Corresponding
to m(H,e) is a content measure defined as follows:

cont(H,e) = m(~H,e)

Most writers who have used measures of content in connection with
scientific inference have made two additional assumptions which shall be
rejected here. The first is that measures of content are purely logical in
the sense that the content of h given e depends solely on the semantic or
syntactic properties of h and e. The second assumption to be rejected is
that the probability measure to be used to measure content is the same
one that is to be used to measure inductive probabilities, fair betting
quotients, or whatever logical application measures of probability are
intended to have in scientific inquiry. In particular, Hempel, Hintikka and
Pietarinen, and I, who are the only ones who have constructed cognitive
decision models before, have adopted these assumptions.17
But if it is true that the sense in which investigators demand information
is one which is controlled by the problem they are attempting to solve,
both of these assumptions should be abandoned. The information sought
is that and only that which is relevant to the question. Given, therefore,
that each element of the ultimate partition U yields as definite a consistent
answer as the question demands, each of them should be assigned equal
content.18 Observe that this is not a verbal quibble over the correct
explication of the notion of content. The issue concerns that notion of content
which is relevant to scientific inference. The claim being made here is that
the appropriate notion is one of content or information demanded by the
question under consideration. This notion does not seem susceptible of a
purely syntactic or semantic explication.
Let H, therefore, be any element of M and let the number of elements
of U incompatible with H given e be k. The content of H given e,
cont(H,e), will be equal to k/n. Each element of U will receive content
equal to (n - 1)/n. This seems to be a suitable measure of the relevant
information added to the evidence for the purpose of answering the
question associated with U.
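
As a rough illustration of this measure, the following Python sketch computes k/n for a few sentences. It is not drawn from the original text: the encoding of a sentence H as the set of elements of U it leaves open (its disjuncts) and the function name cont are assumptions made for the example.

```python
from fractions import Fraction


def cont(H, U):
    """Content of H relative to U: the fraction k/n of elements of U
    that are incompatible with H.

    H is encoded as the set of elements of U it leaves open, i.e. its
    disjuncts (an encoding assumed for this sketch).
    """
    n = len(U)
    k = len(set(U) - set(H))  # elements of U ruled out by H
    return Fraction(k, n)


U = {"h1", "h2", "h3", "h4"}
print(cont({"h1"}, U))        # 3/4, i.e. (n - 1)/n
print(cont({"h1", "h2"}, U))  # 1/2
print(cont(U, U))             # 0 for S, the disjunction of all of U
```

On this encoding every element of U receives the same content, (n - 1)/n, whatever its probability, which is the point of tying content to the question rather than to a logical probability measure.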
In the light of what has been said, accepting H as strongest via induction
from e relative to U has two outcomes: (a) correctly accepting H as
strongest and obtaining cont(H,e) of relevant information, and (b) erroneously
accepting H as strongest and obtaining cont(H,e) of relevant information.
Thus, if there are n elements in U, there will be 2^n cognitive options and
2^(n+1) outcomes. The next step in constructing the model is to define an
appropriate epistemic utility function over the outcomes which will yield,
with the aid of a probability distribution over U, an expected epistemic
utility function defined over the options.

V
Efforts to obtain true and informative answers to questions are
multipurpose endeavors in the sense that the aim of the activity can be viewed
as some sort of compromise between distinct goals each of which could
be pursued to the total neglect of the others. Thus, an investigator might
attempt to obtain true answers to questions without regard to how
informative they are. Alternatively, he might look for informative answers to
questions without any concern for the truth values of the answers he obtains.
Thus, a natural approach to the problem of constructing an epistemic
utility function representing the invariant features of all efforts to obtain
true and informative answers to questions is to begin by considering the
epistemic utility functions appropriate to efforts to obtain true answers to
questions and to efforts to obtain informative answers to questions.
Suppose then that an investigator is interested in “the truth and nothing
but the truth”. It would be a matter of complete indifference to such an
investigator whether the answer he adopts is informative or not as long
as it is true. And, even though he prefers true answers to errors, should
he commit an error it would be a matter of indifference to him whether his
error was informative or not.
Let T(H,t) be the utility of accepting H as strongest when H is true
relative to the objective of obtaining true answers to questions and T(H,f)
be the epistemic utility of accepting H as strongest when H is false. Given
that utility functions are definable only up to a linear transformation, any
utility function suitable for representing efforts to obtain true answers to
questions is a linear transformation of the following:

T(H, t) = 1
T(H, f) = 0

Let ET(H,e) be the expected epistemic utility of accepting H as
strongest relative to e where p(H,e) is some measure of the probability of H
given e used for determining fair bets.

ET(H,e) = p(H,e)T(H,t) + p(~H,e)T(H,f)
        = p(H,e)

The injunction to maximize expected utility recommends, therefore,
picking that element of M as strongest bearing maximum probability
relative to the evidence. This sentence will always be the sentence S which is
the disjunction of all elements of U. In other words, if the aim of an
investigator is to obtain a true answer to a question without regard to how
informative it is, he should suspend judgement and refuse to accept any
conclusion unless it is entailed by his evidence.
Consider, by way of contrast, the investigator interested only in
informative answers. Let C(H,x) be an epistemic utility function representing his
cognitive goal. (Any linear transformation of C(H,x) will do as well.)

C(H,t) = cont(H,e)
C(H,f) = cont(H,e)

Observe that because truth is of no concern here, C(H,t) = C(H,f) for
every H in M.
Let EC(H,e) be the expected epistemic utility afforded by accepting H
as strongest when C(H,x) is the measure of epistemic utility.

EC(H,e) = cont(H,e).

Maximizing this function requires that the investigator accept C as
strongest, i.e., contradict himself.
The fact that pursuing truth to the exclusion of content and content to
the exclusion of truth lead to results in radical conflict with scientific practice
is a fairly decisive refutation of any claim that either desideratum alone is
the aim controlling the adequacy of scientific inferences.
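
The two extreme policies can be checked on a small case. The sketch below is only an illustration under stated assumptions: the three-element partition, its probability values, and the encoding of sentences as sets of disjuncts (with the empty set standing for C) are hypothetical and do not come from the original text.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical probability distribution over a three-element partition U.
probs = {"h1": Fraction(6, 10), "h2": Fraction(3, 10), "h3": Fraction(1, 10)}
U = sorted(probs)
n = len(U)

# M: every sentence encoded as the set of its disjuncts; the empty set encodes C.
M = [frozenset(c) for m in range(n + 1) for c in combinations(U, m)]


def p(H):
    """Probability of H given e: the sum over its disjuncts."""
    return sum((probs[h] for h in H), Fraction(0))


def cont(H):
    """Content of H: fraction of elements of U that H rules out."""
    return Fraction(n - len(H), n)


truth_only = max(M, key=p)       # maximizes ET(H,e) = p(H,e)
content_only = max(M, key=cont)  # maximizes EC(H,e) = cont(H,e)

print(sorted(truth_only))    # ['h1', 'h2', 'h3'] -> S, suspension of judgement
print(sorted(content_only))  # [] -> C, the contradiction
```

Pursuing truth alone selects S and pursuing content alone selects C, in agreement with the two results derived in the text.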
If both truth and content are desiderata which scientists attempting to
answer questions seek to satisfy, it is possible to think of V(H,x) (the
epistemic utility function representing efforts to obtain true and
informative answers to questions) as some weighted function of the functions
T(H,x) and C(H,x).

V(H,x) = wT(H,x) + (1 - w)C(H,x) where 0 ≤ w ≤ 1.19

V(H,t) = w + (1 - w)cont(H,e).
V(H,f) = (1 - w)cont(H,e).

Let EV(H,e) be the measure of expected epistemic utility when epistemic
utility is measured by the V-function.

EV(H,e) = wp(H,e) + (1 - w)cont(H,e)

Let r = 1/w and q = (1 - w)/w. As long as w is positive, both r
and q are positive and finite. Hence, F(H,e) = rEV(H,e) - q is a linear
transformation of EV(H,e) and may be used to represent the expected
epistemic utility in its stead.

F(H,e) = p(H,e) - q cont(~H,e).

When q = 0, 1 - w = 0. In that case, content receives 0 weight and
V(H,x) = T(H,x). When q = ∞, w = 0 and V(H,x) = C(H,x).20
Thus, the index q represents the relative importance attached to truth as
compared to content. When q is high, an investigator tends to accept
stronger conclusions than when q is low. In this sense, an increase in the
value of q reflects a decrease in the “degree of caution” exercised by an
investigator. More will be said about degrees of caution shortly.
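
The passage from EV to F rests on the identity cont(~H,e) = 1 - cont(H,e), which follows from the definition of content as m(~H,e). A minimal numerical check, written in Python with a hypothetical weight w and hypothetical probability and content values, may make the rescaling easier to follow; none of the numbers come from the original text.

```python
from fractions import Fraction


def EV(p, c, w):
    """Expected epistemic utility: w*p(H,e) + (1 - w)*cont(H,e)."""
    return w * p + (1 - w) * c


def F(p, c, q):
    """Rescaled utility: p(H,e) - q*cont(~H,e), with cont(~H,e) = 1 - cont(H,e)."""
    return p - q * (1 - c)


w = Fraction(2, 3)         # hypothetical weight on truth
r, q = 1 / w, (1 - w) / w  # r = 3/2, q = 1/2

# Check F = r*EV - q on a few (probability, content) pairs.
for p, c in [(Fraction(6, 10), Fraction(2, 3)), (Fraction(9, 10), Fraction(1, 3))]:
    assert F(p, c, q) == r * EV(p, c, w) - q
print(r, q)  # 3/2 1/2
```

Because r is positive, F and EV rank the options in M identically, so maximizing either yields the same recommendation.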
Let F(H,e) and F(G,e) be maximal in the sense that the F-values
accorded to H and G relative to e are not less than those accorded to any
other elements of M. Of course, F(H,e) = F(G,e). It can be proven that
when this happens F(HvG,e) is also maximal.21
Now the injunction to maximize expected utility leads to the
recommendation to accept as strongest any element H in M such that F(H,e)
is maximal. It allows free choice among all such elements of M. But if
accepting H and accepting G are equally “good”, the plausible
recommendation is to suspend judgement between the two. To do this, however,
is to accept the disjunction of H and G as strongest, which is a different
option from accepting H as strongest or accepting G as strongest or
remaining, like Buridan’s ass, in a state of indecision between these two
options. Given that accepting HvG as strongest will be a “best” option
whenever accepting H as strongest and accepting G as strongest are, a
unique recommendation in conformity with presystematic precedent is
obtained by supplementing the injunction to maximize expected utility with
the following rule for ties:

Rule for Ties: Let H1, H2, . . . , Hn be elements of M such that F(Hi,e)
is maximal. Let G be that one of the Hi’s (there will be one and only
one) which is equivalent, given e, to the disjunction of all of them.
Accept G as strongest via induction from e relative to U.

The injunction to maximize expected utility and the rule for ties yield,
when epistemic utility is measured by the V-function, a rule for obtaining
answers to questions relative to given evidence which can be formulated
as follows:

(A) Divide the elements of the ultimate partition U into two sets: the set
K such that H is an element of K if and only if p(H,e) < q cont(-H,e)
= q/n (where n is the number of elements in U); and the complement
of K in U.

Accept as strongest via induction from e relative to U that element of
M which is the disjunction of the members of the complement of K.22
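
Rule (A) is mechanical enough to be sketched in a few lines. The sketch below is an illustration only: the function name, the dictionary representation of U, and the four-element partition are assumptions introduced here, not part of the text.

```python
def rule_a(probabilities, q):
    """Apply rule (A) to an ultimate partition U.

    probabilities: dict mapping each element of U to p(H,e); values sum to 1.
    q: index of caution, assumed to satisfy 0 < q <= 1.
    Returns the set of unrejected elements of U. The sentence accepted as
    strongest is the disjunction of these elements.
    """
    n = len(probabilities)
    threshold = q / n
    # K is the set of elements whose probability falls below q/n.
    k = {h for h, p in probabilities.items() if p < threshold}
    return set(probabilities) - k


# Hypothetical four-element ultimate partition.
u = {"H1": 0.60, "H2": 0.25, "H3": 0.10, "H4": 0.05}
print(sorted(rule_a(u, q=1.0)))  # rejection threshold 0.25
print(sorted(rule_a(u, q=0.3)))  # rejection threshold 0.075
```

Lowering q lowers the rejection threshold, so fewer elements are rejected and the accepted disjunction is weaker; this is the sense in which a lower q corresponds to greater caution.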

Rule (A) is a standard in terms of which inductive inferences can be
evaluated—provided that the question being studied has been sharpened
by identifying a list of relevant answers for the question with the aid of
an ultimate partition U and that a probability distribution is assigned to
U. When these conditions are satisfied, conformity to rule (A) has been
shown to be the best method of promoting the objective of obtaining true
and informative answers to the question. The opportunity for applying it
in real life inquiries is, of course, severely limited by the rarity with which
the probabilities required can be supplied. Nonetheless, rule (A) can be
used to shed light on certain features of scientific inference and, in the
case of certain kinds of statistical inference, can be directly applied.
Consequently, there are some checks which can be made on rule (A). Space
will not permit review of all that has been done in the way of exploring
its ramifications. Some indication should be given, however, of what can
be achieved.

VI

The q-index of caution has been interpreted as reflecting the relative
importance which is to be attached to truth as compared to information.
Can any constraints be imposed on the values it may take?
One requirement that can be made is that q be positive. This insures that
the investigator accords some weight to content as a desideratum.
Similarly, the requirement that q be finite insures that truth will be a
desideratum. Actually, however, a still stronger upper bound can be imposed upon
q. No matter how much freedom is to be allowed to an investigator who
has a strong “will to believe” (i.e., who is so incautious as to have a strong
desire for content as compared to truth), he should not be allowed to
attach greater epistemic worth to errors than to correct answers. This means
that V(G,f) should not be greater than V(H,t) for any H and G in M.
When this requirement is imposed, it can be shown that q cannot be
greater than 1 (i.e., q ≤ 1).23
The same argument can be put in a somewhat different way. When q
is greater than 1, there will be some probability distributions which will
yield, via rule (A), a recommendation to accept contradictions. When q
is less than or equal to 1, this is impossible. The fact that the same
condition prevents contradiction and reflects the maximum importance which
can be attached to content vis-a-vis truth consistent with the requirement
that error never be preferred to truth is itself an interesting feature of the
proposals which have been made.
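
Both halves of this observation can be illustrated with a short sketch. The uniform four-element partition and the helper names are assumptions made for illustration. The utility check compares the worst correct answer (content 0) with the best error (content 1), following the V-function defined earlier.

```python
def unrejected(probabilities, q):
    """Elements of U that survive rule (A): those with p(H,e) >= q/n."""
    n = len(probabilities)
    return [h for h, p in probabilities.items() if p >= q / n]


# Hypothetical uniform distribution over a four-element partition.
uniform = {"H1": 0.25, "H2": 0.25, "H3": 0.25, "H4": 0.25}

# With q = 1 nothing is rejected; with q > 1 every element is rejected,
# which amounts to accepting a contradiction.
assert unrejected(uniform, q=1.0) == ["H1", "H2", "H3", "H4"]
assert unrejected(uniform, q=1.2) == []


def error_never_preferred(w):
    """True when the best error is not valued above the worst correct answer."""
    best_error = (1 - w) * 1.0       # V(H,f) with content 1
    worst_truth = w + (1 - w) * 0.0  # V(H,t) with content 0
    return best_error <= worst_truth


# q = (1 - w)/w: w = 0.5 gives q = 1, w = 0.4 gives q = 1.5.
assert error_never_preferred(0.5)
assert not error_never_preferred(0.4)
```

When q is at most 1, some element of U always has probability at least 1/n and so at least q/n, which is why the rule cannot then reject every element.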
There does not seem to be any other constraint on the value of q
which can be imposed with the same degree of plausibility. On the other
hand, if two investigators studying the same problem and having the same
evidence exercise different degrees of caution, they can disagree in the
conclusions they reach in the sense that one will accept as strongest a
weaker conclusion than the other.
This consequence does not appear, however, to be disastrous. In
certain theories of statistical inference, parameters having somewhat similar
properties appear (e.g., in assigning significance levels). In those cases,
there do not seem to be unique standard values which can be imposed.
Very frequently, the choice of values for these parameters is taken to
reflect the practical goals of the investigator. One of the virtues of the
approach adopted here is that it provides an interpretation of the q-index
which identifies its value with a certain epistemic as opposed to practical
goal.24
Thus, a case can be made out for maintaining that reasonable men can
differ in the conclusions they reach concerning a given problem relative to
given evidence. Such disagreement can be due to differences in the degrees
of caution exercised. The fact that this is a consequence of the proposals
made here does not constitute a difficulty for these proposals but, rather,
leads to another problem which they were not designed to handle: How
does one proceed in attempting to resolve such disagreements? The answer
seems to be: Look for more evidence.
One reason why a man might look for new evidence is that the
conclusion justified by the available evidence does not fully answer his question.
In the terminology used here, the sentence whose acceptance as strongest
relative to the available evidence e is warranted by e is not an element of
the ultimate partition U.
Suppose, however, that the evidence e did warrant accepting an
element of U as strongest. Further evidence collection might still be desirable.
In particular, if the investigator was justified in accepting an element of
U due to the fact that he was exercising a low degree of caution—i.e., he
was using a high value for q—the conclusion reached would not be taken
as justified by even moderately sceptical investigators. The second reason
for looking for additional evidence is to win agreement from
investigators whose disagreement stems from their using different values for q (i.e.,
their exercising different degrees of caution).
Of course, if agreement were to be expected of investigators exercising
any allowed degree of caution, then inquiry would continue until evidence
entailing the truth of an element of U was obtained—an eventuality which
is not always to be expected. But the arguments of science are not intended
to convince either utter sceptics (who use a 0 q-value) or virtual sceptics
(who use a q value near 0). It would be foolhardy to identify a certain
range of values as including all sceptics and virtual sceptics. But failure
to be precise on this score can be taken as recognition of the fact that
when inquiry terminates because further evidence acquisition is pointless,
the judgement of pointlessness is itself subject to revision in the future.
One of several reasons why the matter might be reopened is that a demand
has been made to provide arguments rationally convincing to more cautious
investigators.
Thus, even though a rational investigator might accept a conclusion
which another would not, both would agree that for that reason further
investigation is in order. The less cautious investigator might express his
views by saying that his conclusions are “speculative” or “tentative” and,
hence, stand in need of further “testing”.25

VII

One place where rule (A) can with some plausibility be directly applied
is in situations where an investigator wishes to make predictions regarding
the relative frequencies of certain kinds of events in a series of trials on the
basis of statistical assumptions.
Suppose, in particular, that some experiment having two outcomes
H and T is repeated n times and that the chance that H will appear on the
ith trial has the same value p for each i. What prediction should be made
regarding relative frequencies? The usually accepted method is to
recommend adopting that conclusion which bears a probability near to one.26
This prescription does yield an intuitively acceptable result—to wit, the
prediction that the relative frequency lies in an interval around p, an
interval which decreases in size as n increases. It also allows the conclusion that
the relative frequency is not p or, for that matter, any other specific value
in that interval.
Rule (A) can be applied to this problem in such a way as to yield the
intuitively acceptable result without the embarrassing accompaniments.
When the concern is to predict which relative frequency of the n + 1
possible ones is the true one, the ultimate partition U consists of n + 1
hypotheses of the form “The relative frequency is r/n” where r takes values
from 0 to n. An element of U is rejected according to rule (A) when its
probability is less than q/n. It can be shown that the strongest conclusion
acceptable according to rule (A) is a judgement to the effect that the
relative frequency lies in an interval around p. This interval decreases in length
as n increases and q remains constant.27
Thus, rule (A) yields the same intuitively acceptable result allowed by
the orthodox approach; but it does not permit the other objectionable
conclusions. For example, “The relative frequency is exactly p” will not be
rejected; for its probability will not be less than q/n even when q = 1.
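
The behaviour described here can be explored with a small sketch for binomial trials. Two assumptions are made that go beyond the text: the probabilities of the n + 1 hypotheses are taken to be the binomial probabilities of r successes in n trials, and the rejection threshold is computed as q divided by the number of elements of U (here n + 1), following the statement of rule (A), whereas the passage above writes q/n. The function name is illustrative.

```python
from math import comb


def accepted_frequencies(n, p, q):
    """Lowest and highest unrejected relative frequency r/n under rule (A).

    U has n + 1 elements ("the relative frequency is r/n", r = 0..n), so an
    element is rejected when its binomial probability is below q/(n + 1).
    """
    threshold = q / (n + 1)
    probs = [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]
    kept = [r for r, pr in enumerate(probs) if pr >= threshold]
    return kept[0] / n, kept[-1] / n


for n in (10, 100, 1000):
    low, high = accepted_frequencies(n, p=0.5, q=1.0)
    print(n, low, high)
```

Since the largest of n + 1 probabilities summing to 1 cannot be less than 1/(n + 1), the most probable relative frequency is never rejected under this threshold when q is at most 1, and the printed intervals narrow around p as n grows.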
One reason that rule (A) yields this attractive result is that it has been
applied under the assumption that the question is: “Which of the n + 1
relative frequencies is the correct one?” Had the question been “Is p
the true relative frequency?”, rule (A) would have recommended
answering that it is not. The reason for this is that in such a situation the ultimate
partition has only two elements: “p is the relative frequency” and “p is not
the relative frequency”. q/2 is less than or equal to .5. Hence, the first
element, “p is the relative frequency,” will be rejected.
It may be objected that rule (A) does not avoid the objectionable
results after all. By adjusting the ultimate partition U in suitable ways,
rule (A) will recommend all of the conflicting conclusions recommended
by the usual high probability rule.
This objection rests on either a misunderstanding or a rejection of the
approach to scientific inference on which the proposals made here are
based. According to that approach, when someone draws conclusions from
given evidence his concern is to provide an answer to a given question. The
legitimacy of his inferences depends, therefore, as much on the question
he has raised (and the list of relevant answers associated with it) as it
does on the evidence on the basis of which he reaches that conclusion. It
is widely recognized that an agent can reach conflicting conclusions
relative to different bodies of evidence. The same thing is true of conclusions
reached relative to different questions associated with distinct ultimate
partitions.
Thus, if one asks about any closed thermodynamic system whether
over a short period of time it will display entropy reversal, the answer must
surely be in the negative. On the other hand, if one asks “in the long run”
what the relative frequency will be with which such systems occur, the
answer will entail rejection of the statement that no such system displays
entropy reversal. The usual high probability acceptance rule will lead to the
conclusion that a system of inconsistent results must be accepted. The
inconsistency is an obnoxious one because it has been obtained relative to
the same body of evidence. When rule (A) is used, on the other hand,
the inconsistency is no longer obnoxious because consistency is demanded
only in conclusions reached relative to fixed evidence and a fixed ultimate
partition.28
One way in which a high probability rule might be amended to meet the
requirements of rule (A) is to restrict the condition of acceptance to
elements of M bearing high probability. But this also leads to contradiction.
The second feature of rule (A) of vital importance in avoiding difficulties
of the sort under consideration is its refusal to recognize high probability
as a sufficient condition for acceptance. An element of U will avoid
rejection as long as its probability is greater than q/n. Since probabilities
meeting that requirement may be very small, the contradictories of
elements of U will often fail to be accepted even though their probabilities
are very high. This explains why in the example mentioned previously
the statement “the relative frequency will not be p” will fail to be accepted
as a good answer to the question about long run relative frequency.
According to rule (A), high probability is not only insufficient for
acceptance, it is not necessary either. This consequence could be avoided by
restricting values of the q-index of caution to those less than or equal to
.5n/(n - 1). However, there seems to be no reason for this carrying the same
conviction as that which recommends restricting q-values to those less than 1.
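
The bound .5n/(n - 1) can be recovered by a short calculation, sketched below under the assumption that accepting an element H of U as strongest requires rejecting each of the other n - 1 elements, so that each of them has probability below q/n. The function name and the choice n = 10 are illustrative.

```python
def lower_bound_on_accepted(n, q):
    """Greatest lower bound on p(H,e) when H in U is accepted as strongest.

    The other n - 1 elements are all rejected, each with probability below
    q/n, so p(H,e) > 1 - (n - 1) * q / n.
    """
    return 1 - (n - 1) * q / n


n = 10
q_bound = 0.5 * n / (n - 1)

# At q = .5n/(n - 1) an accepted element must have probability above .5.
assert abs(lower_bound_on_accepted(n, q_bound) - 0.5) < 1e-12

# With q = 1 an element with probability only slightly above 1/n may be accepted.
assert abs(lower_bound_on_accepted(n, 1.0) - 0.1) < 1e-12
```

Setting 1 - (n - 1)q/n equal to .5 and solving for q gives q = .5n/(n - 1), which agrees with the restriction mentioned above.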
Two points can be made, however, to mollify advocates of high
probability. The first is that when the ultimate partition contains only two
elements, then rule (A) takes high probability as necessary and sufficient for
acceptance.
The second point concerns the probability which an element of U should
have in order that it be accepted relative to the evidence e and, in addition,
that further efforts to obtain evidence be considered pointless. Advocates
of high probability as a necessary condition for acceptance may be viewed
as individuals who adopt a moderate but not excessively high degree of
caution. Consequently, if evidence acquisition is to terminate, the available
evidence should warrant their accepting an element of U. In this sense, a
high probability may be necessary (but not sufficient) for final acceptance.
According to rule (A), however, it is not necessary for tentative
acceptance.

VIII

Ernest Nagel has contributed much of the most searching criticism of
current theories of confirmation. His complaints do not rest solely on
reference to the fact that theories like Carnap’s are, at present, so
simplified and idealized as to be inapplicable in any direct way to interesting
scientific problems. Even when allowance is made for this, confirmation
theory fails to accommodate many familiar features of scientific inference.
The problems discussed by Nagel will be considered here. The first
is the well known 0-confirmation phenomenon. In one of Carnap’s
simplified languages where a member of his lambda family of confirmation
measures is employed,29 a universal generalization receives very low
confirmation in large universes and 0 confirmation in infinite ones even though
the available evidence consists of substantial numbers of exclusively
confirming instances. Nagel finds this objectionable.30
But even if this difficulty can be bypassed or overlooked, there is
another. Carnap’s measures of confirmation are insensitive to the importance
of variety of confirming instances.31
Space will not permit a detailed examination of Nagel’s discussion here.
I think Nagel’s scepticism regarding the ability of probabilistic measures of
confirmation to mirror scientific practice is sound. This does not mean,
however, that a probabilistic confirmation theory of the sort suggested by
Carnap cannot contribute to a more adequate reconstruction. It is not,
after all, surprising that Carnap’s program should be inadequate as it
stands. For the aim of the Carnap program is to show how probabilistic
degrees of confirmation vary with the evidence. But, as Nagel has pointed
out, it is doubtful whether scientific conclusions are frequently if ever
evaluated in any direct way in terms of such probabilities—even in a
comparative sense.32 At least this much seems clear. There are no clear
and noncontroversial presystematic judgements about how scientists assign
probabilities to hypotheses to which decisive appeal can be made in order
to judge the adequacy of Carnap’s theory.
It is, indeed, true that a varied sample of confirming instances will
warrant accepting a hypothesis where an unvaried sample would not. It is also true
that the more varied the sample, the less tentative that acceptance would be.
It is a mistake, however, to suppose that these observations imply anything
as to how probabilities should mirror variety. Unless Carnap’s confirmation
theory is supplemented with some “inductive acceptance procedure” which
indicates how degrees of confirmation control rational acceptance, there
is no link between Carnap’s confirmation theory and science. In the absence
of an inductive methodology, as Nagel has explicitly observed, “objective
data are . . . largely lacking for judging the adequacy of his inductive logic
for its ostensible purpose.”33
One possible way to remedy at least part of this deficiency is to combine
Carnap’s confirmation measures together with rule (A) and explore the
consequences. In order to handle infinite universes, rule (A) would have
to be slightly modified. To avoid complications in the exposition, therefore,
attention will be restricted to finite but large universes.
Let h be some sentence of the form (x)(Px -> Rx) where P and R are
molecular predicates in one of Carnap’s universes. Assume that the ultimate
partition consists of all structure descriptions compatible with the
investigator’s available evidence. In the subsequent discussion, it shall be
assumed that the total evidence consists of a complete Q-description of a
number s of observed individuals. We shall assume that the investigator
has adopted a definite value for the q-index of caution. Under these
circumstances, the following theorems can be proven (no proof will be given
here):

(1) If the evidence e implies that all s individuals are confirming instances
of h and N is the size of the universe, then there is a positive value of
the lambda parameter such that when that value is used to generate a
measure of confirmation, h will be accepted relative to e. The maximum
such value for lambda decreases with increasing N and increases with s.

(2) Let e1 imply that s individuals are confirming instances of h and that
they all share the same Q-predicate. Let e2 imply that s individuals are
confirming instances of h but that they are as heterogeneous as possible
with respect to Q-properties consistent with their being confirming
instances of h. There is a positive value of lambda such that when that
value is used to generate a measure of confirmation, h will be accepted
relative to e2 but will fail to be accepted relative to e1. This theorem fails
only in the case where N = s + 1.

Theorem 1 shows that the 0 confirmation phenomenon is not disastrous
when conditions for rational acceptance are at stake. Even if h bears low
probability relative to the evidence, it may, nonetheless, be accepted.
Theorem 2 shows that confirmation measures do display some
sensitivity to variety when combined with rule (A) to yield recommendations
regarding legitimate inference.
Needless to say, much more needs to be done to assess the significance
of these theorems. They do go part of the way, however, towards meeting
Nagel’s strictures.

NOTES
1. Work on this paper was done while I was a Guggenheim Fellow and Fulbright
Scholar at the London School of Economics in 1966-67.
la. Ernest Nagel, The Structure of Science, New York: Harcourt Brace (1961),
p. 14.
2. See, for example, The Structure of Science, p. 13.
3. See, for example, R. Rudner, “The Scientist qua Scientist Makes Value
Judgements,” Philosophy of Science 20 (1953), pp. 1-6 and R. C. Jeffrey’s commentary
“Valuation and Acceptance of Scientific Hypotheses,” Philosophy of Science 23
(1956), pp. 237-246.
4. The Structure of Science, p. 497.
5. The Structure of Science, p. 498.
6. This question should not be confused with the problem of “vindicating”
induction. Vindications purport to justify inductive reasoning by showing that certain
“inductive policies” best promote the aims of science. Unlike the arguments to be
presented here, however, vindications are intended to answer Hume’s problem. This
means that the assumptions on which they are based are held to be necessary,
incorrigible, indubitable or the like.
This paper is not concerned at all with Hume’s problem. The problem is to
provide a systematic account of standards for legitimate inference which relates
conformity to such standards with scientific objectives. The fundamental assumptions of
this account are, like the assumptions of any scientific theory, subject to criticism
and liable to revision.
7. Observe, however, that very little hinges on whether the interpretation of
probabilities involved is “subjective”, “logical” or “frequentist” as long as it is
assumed that probabilities can be used to determine “fair betting quotients.”
8. The quotation marks are intended as a nod in the direction of those who
may wonder whether we are always in a position to decide what to accept or to
believe. No matter how this question is settled, it remains true that inferences are
subject to criticism; and this implies comparison of the conclusions reached with
alternative conclusions which might have been reached. This supplies enough of
an analogy to deliberate decision-making to warrant using the formal machinery of
decision theory in connection with scientific inference.
9. Although occasional references to an approach similar to Hempel’s were
made earlier, Hempel’s discussion of epistemic utility is the first detailed proposal
of its kind which, to my knowledge, has been published. See “Inductive
Inconsistencies,” Synthese 4 (1960), pp. 462-469, and “Deductive-Nomological vs. Statistical
Explanation,” Minnesota Studies in Philosophy of Science 3 (1962), pp. 149-163.
My own first proposal along these lines [“On the Seriousness of Mistakes,”
Philosophy of Science 29 (1960), pp. 47-65] was written in ignorance of Hempel’s work.
Aside from the proposals made by Hempel and myself in the papers just mentioned,
the only other work along similar lines is the presentation of an epistemic utility
function by me in “Corroboration and Rules of Acceptance,” British Journal for
the Philosophy of Science 13 (1963), pp. 311-313 and proposal of the same function
by J. Hintikka and J. Pietarinen, “Semantic Information and Inductive Logic,”
Aspects of Inductive Logic, ed. by Hintikka and Suppes, Amsterdam, North Holland
(1966), pp. 105-111.
10. “Inductive Inconsistencies,” p. 464.
11. “Inductive Inconsistencies,” p. 464.
12. “Inductive Inconsistencies,” p. 465.
13. In short, I do think that scientific inquiries terminate on occasion and the
conclusions reached are taken to be conclusively established. This does not imply
that such inquiries cannot be reopened or that claims which were erstwhile considered
conclusively established are immune from revision. But there are occasions where
inquiry is cut off not because scientists become tired or due to economic, political
or moral difficulties but because the problem has been as decisively settled as anyone
could reasonably wish. In such cases, it would be foolish to continue inquiry into
the same problem until, by accident or due to the results of other inquiries, some
good reason is provided for doing so.
14. The relevant answers to some questions cannot be generated, so it would
seem, from finite ultimate partitions. For example, consider cases where the value
of some parameter having one of a continuum of possible values is to be estimated.
I think that these cases can be treated with the aid of modified versions of the
proposals made here; but introducing the modifications would obscure the argument
at this point.
More critical are those cases, such as when one is choosing between two or three
theories, where the ultimate partition is not exhaustive. It is, of course, always
possible to insure that the ultimate partition is exhaustive by including the “residual”
hypothesis (a term used by G. L. S. Shackle) which asserts that none of the other
elements in U are true. This leads to other complications, however, which are
mentioned in footnote 18.
Finally, whether one considers a given question Q to be the same when the
ultimate partition U is changed to another U' is largely a verbal matter. The critical
point is that the cognitive objective which controls the selection of an answer—i.e.,
the making of an inference—changes with a change in the ultimate partition.
15. Hempel’s list of cognitive options is the same as the one I proposed in “On
the Seriousness of Mistakes”. The criticisms levelled against him here apply equally
to my own paper.
16. The set of sentences equivalent given e to elements of M constitute a boolean
field. The contradictory — H of some sentence H in M can be understood to be
either the contradictory of H strictly understood or that element in M equivalent
given e to the contradictory. A similar double interpretation can be given to the
disjunctive and conjunctive connectives.
17. In addition to making these assumptions, which I no longer accept, some
further remarks about Hempel’s content measure and that used by Hintikka and
Pietarinen and myself are in order. Hempel took as his measure of content a
measure of the content added by accepting H to that already afforded by the
evidence. Subject to the reservations indicated in the text, I now agree with his
approach. However, in “Corroboration and Rules of Acceptance” and “Decision Theory
and Confirmation,” [The Journal of Philosophy 58 (1961), 621-625], I raised an
objection against it. Because of this, I used (in “Corroboration and Rules of
Acceptance”) an absolute measure of content, and Hintikka and Pietarinen have done
likewise. I now think that my objection can be avoided and, subject to the
modifications mentioned in the text, I am now adopting a measure of “content added.”
18. This approach does seem entirely appropriate to large numbers of cognitive
decision problems. It does not seem plausible when applied to cases of choosing
between theories where one element of U is a “residual hypothesis” x (see footnote
14). The reason is that, although the residual hypothesis is a strongest consistent
answer, it is not a theory which can be used for explanation. In such cases, it
seems plausible to accord a much lower content value to the residual hypothesis
than to the others. This qualification, however, does not mean that the measure of
content for such cases is some logical measure in the sense just given or that it is
correlated with the same measure function used to measure probabilities. Although
I have no definite proposals to make as to how content values should be assigned
when the choice is among theories, most of the formal results obtained in this
discussion are applicable to such cases, insofar as they are at all plausible.
19. Too much significance should not be attached to the numerical value of
the weight w. Keep in mind that any utility function which is obtained by linear
transformation of the T-function is an adequate utility function for representing the
goal of obtaining true answers. A similar remark obtains regarding the C-function.
The 0 points for measuring utility in these two cases are not of any real importance
here, but the arbitrariness in the choice of unit is. Thus, let T' = aT and C' = bC
where a and b are positive constants. If V = wT + (1 - w)C and V' = wT' +
(1 - w)C', it can be shown that V' is not a linear transformation of V. Let x =
aw + b(1 - w). x is positive. Furthermore, there is a number between 0 and 1,
w', such that V' = x(w'T + (1 - w')C). V' is thus obtainable via linear
transformation from a weighted average of T and C. But, in the general case, the weight w'
will not equal the weight w. Hence, V' cannot be obtained by linear transformation
from V.
To be specific, if w = 1 — w, it would be misleading to conclude that truth and
content are weighted equally; for by a suitable change of units for measuring the
utility of truth and the utility of content, one would obtain an equivalent utility
function in which the weights w' and 1 — w' were not equal. The weight w does
have the following significance. As w increases, the relative importance of truth as
compared to content increases.
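
The effect of a change of units on the weight can be made concrete with a minimal sketch. The function name and the sample values of w, a, and b are illustrative assumptions; the formula w' = aw/x is obtained by matching coefficients of T in the identity stated in this note.

```python
def reweight(w, a, b):
    """Return (x, w_prime) such that
    w*(a*T) + (1 - w)*(b*C) == x * (w_prime*T + (1 - w_prime)*C).
    """
    x = a * w + b * (1 - w)
    w_prime = a * w / x
    return x, w_prime


# Equal weights before the change of units (w = 1 - w = 0.5).
w, a, b = 0.5, 2.0, 1.0
x, w_prime = reweight(w, a, b)

# Check the identity for sample utilities T = 1.0 and C = 0.3.
t, c = 1.0, 0.3
lhs = w * a * t + (1 - w) * b * c
rhs = x * (w_prime * t + (1 - w_prime) * c)
assert abs(lhs - rhs) < 1e-12

# After rescaling, the weights are no longer equal.
assert w_prime != w
```

With these sample values w' comes out at two thirds, so weights that were equal in one system of units are unequal in another.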
20. If the T-function and C-function are transformed by change of unit as in
footnote 19, the weights would have to be changed from w to w'. Suppose, in
particular, that V' = w'aT + (1 - w')bC and that it is obtainable by multiplying V
(= wT + (1 - w)C) by a positive factor. q by definition equals (1 - w)/w. But
this equals b(1 - w')/aw'. Thus, the significance which q has in the function F(H,e)
can be detached from the choices of units for measuring the utility of truth and
content. Thus, when q is greater than 1, it can be shown that some errors are
preferred to some correct answers, whereas for q-values less than or equal to 1 this is
not so. This is so, no matter what the units for measuring the utility of truth and
the utility of content happen to be.
21. Proof: F(HvG,e) = p(H,e) + p(G,e) - p(H&G,e) - q cont(-H,e) -
q cont(-G,e) + q cont(-(H&G),e) = F(H,e) + F(G,e) - F(H&G,e). We are
given that F(H,e) = F(G,e) and that these values are maximal. Hence, F(HvG,e) +
F(H&G,e) = 2F(H,e). Since neither term on the left hand side of the equation can
be greater than F(H,e), they both must be equal to it. This proves the theorem.
22. Let H and G be elements of U. Because they are exclusive given e, F(H&G,e)
= 0 and, in virtue of the argument of footnote 21, F(HvG,e) = F(H,e) + F(G,e).
This means that the sentence E whose acceptance as strongest will be recommended
by the injunction to maximize expected utility will be that disjunction of elements
of U, H1 v H2 v . . . v Hk, such that for each Hi in the disjunction F(Hi,e) is
nonnegative. But these Hi’s form the complement of the set K in U mentioned in
rule (A).
23. V(S,t) is less than or equal to V(H,t) for every H in M, and V(G,f) is less
than or equal to V(C,f) for every G in M. Consequently, the requirement that no
error be preferred to any correct answer is fulfilled if and only if V(C,f) is less
than or equal to V(S,t). This is true if and only if (1 - w) is less than or equal
to w. And this is equivalent to saying that q is less than or equal to 1.
24. I first introduced the notion of degree of caution in “On the Seriousness of
Mistakes” and used it there as an interpretation of levels of significance. (See, in
particular, pp. 56-57 and p. 63 of that article.)
25. The ideas mentioned here can be exploited to construct a measure of degree
of acceptance which has a formal structure akin to that found in G. L. S. Shackle’s
theory of potential surprise. (See, in particular, Decision, Order and Time in Human
Affairs, chapters VII-XII.) I think that Shackle has captured an important notion
of degree of acceptance or degree of belief which has been neglected due in part to
certain dogmas which surround probabilistic interpretations of degrees of belief.
26. See Harald Cramér, Mathematical Methods of Statistics, Princeton:
Princeton U. Press (1946), pp. 148-150, and C. G. Hempel, “Deductive-Nomological vs.
Statistical Explanation,” pp. 141-149. Hempel’s central concern is with statistical
explanation as opposed to statistical prediction. The critical remarks to be made
here regarding the high probability acceptance rule for statistical prediction apply
to statistical explanation as well and should, if they are sound, lead to emendations
of Hempel’s account parallel to the sort mentioned in the text for statistical pre¬
diction.
27. No proof will be offered here. I do have proofs, however. Unfortunately, I
am unable to provide a rule for easily computing the shortest acceptable interval
around p for any given q.
28. Henry Kyburg has suggested avoiding inconsistency by tampering with the
laws of deductive logic. This desperate stratagem has been avoided here by a
revision of methodology. Instead of requiring that the set of sentences which an
investigator accepts relative to his evidence ought to be deductively closed, the
proposals made here assume that the set of sentences which an investigator accepts
ought to be deductively closed relative not only to his evidence but to the question
he is considering as well.
29. Space will not permit a review of the rudiments of Carnap’s confirmation
theory to which reference will be made below. The reader will be expected to
understand some of the terminology from the Logical Foundations of Probability and
The Continuum of Inductive Methods. I hope, however, that the main philosophical
points are clear.
30. E. Nagel, “Carnap’s Theory of Induction,” The Philosophy of Rudolf Carnap,
La Salle, Ill.: Open Court (1963), pp. 799-805.
31. E. Nagel, “Carnap’s Theory of Induction,” pp. 806-810 and Principles of the
Theory of Probability, Chicago (1939), pp. 68-71.
32. E. Nagel, “Carnap’s Theory of Induction,” pp. 787-788.
33. E. Nagel, “Carnap’s Theory of Induction,” p. 786.
CONCEPTS OF STATISTICAL EVIDENCE1
Allan Birnbaum

1. INTRODUCTION

1.1 General Introduction


A variety of concepts of direct empirical evidence (or experimental or
observational evidence) have played roles in the development of mathematical
statistics. We shall refer to these as concepts of statistical evidence.
Some of these concepts are related specifically to errors of observation and
their treatment, others to methods of testing statistical hypotheses. Still
others originated in connection with such problems but have assumed more
general forms. Some of these concepts are indicated by the terms “precision
of a measurement,” “significance level,” “sufficient statistic,” and
“the likelihood principle.” This paper provides a self-contained expository
perspective on the foundations of statistical inference, with particular
reference to such concepts, and discusses some problems of current interest.
The formal (mathematical, syntactic) aspects of these concepts are
generally represented rather clearly by definitions referring to specified
mathematical-statistical models of experiments. It is the semantic (extra-
mathematical) aspects of these concepts which have appeared problematic
to many investigators actively concerned with applied statistical methods
and their theory, recurrently for two and a half centuries. (These problems
are rather distinct from those concerning interpretations of probability, as
will appear below.) Most of these concepts have either emerged or
undergone notable evolution since the 1920’s. Each decade of this period
has seen intensive mathematical and conceptual developments, accompanied
by increasing applications of mathematical statistics in various research
disciplines and by vigorous critical analysis. During this period the possibility
and the relevance of developing precise general concepts of statistical
evidence have been strongly affirmed by some, and denied by others, from
several standpoints.
Important landmarks in this problem-area are provided by three general
issues:

1. Does there exist an appropriate Bayesian “logic” (in which models
of experiments are complemented by an investigator’s prior probability
concepts)? An affirmative answer entails, as we shall indicate,
that the main questions to be considered are not problematical, but
have immediate clear resolutions, in which there is no important role
for concepts of statistical evidence.
2. Do there exist appropriate decision-theoretic models (in which the
model of an experiment is complemented by a specified set of alternative
decisions or actions contemplated by an investigator, and his
loss or utility function)? An affirmative answer clearly entails the
irrelevance of the main questions to be considered, as we shall indicate.
3. Do the main features of analysis and interpretation of observational
results elude accurate and useful representation by precise general
concepts of statistical evidence linked to experimental models? An
affirmative answer of course implies a serious deficiency of realism
and relevance of the main questions to be considered below.

Thus the problem-area of main concern here may be described as that
of determining precise concepts of statistical evidence (systematically
linked with mathematical models of experiments), concepts which are to
be non-Bayesian, non-decision-theoretic, and significantly relevant to
statistical practice.
Definite concepts of statistical evidence appeared earliest with Arbuthnot
(1710) and Bernoulli (1713). The apparent inadequacy of such concepts
led to the development of the additional concept of prior probability,
tentatively by Bayes (1763) and decisively by Laplace (1774, and e.g. 1820).
In the present century, renewed and intensive concern with non-Bayesian
concepts of statistical evidence has been a central feature of several major
developments in mathematical and applied statistics. But once more the
apparent inadequacy of such concepts has tended to encourage other
kinds of developments: Bayesian (e.g. Jeffreys, 1961); decision-theoretic
(Neyman, 1957, 1962, Wald, 1942, 1950); both Bayesian and decision-
theoretic (Savage, 1954); or avowedly eclectic, informal, or pragmatic,
avoiding systematic concern with such questions (e.g. Tukey, 1960, 1962).
Several questions about meanings of terms are involved in the three
issues indicated above. Judgments that various concepts are not meaningful
in appropriate ways underlie rejections of:

(i) prior probability concepts (e.g. Boole, 1854, Venn, 1888, Fisher,
1921, 1950, and Neyman, 1962, all interestingly foreshadowed by
Bernoulli, 1713);
(ii) decision-theoretic concepts (e.g. Cox, 1958, p. 354);
(iii) proposed precise concepts of statistical evidence in general (e.g.
Tukey, 1962).

Within our delimited problem-area (that of precise non-Bayesian
non-decision-theoretic concepts of statistical evidence), further questions about
114 1. SCIENCE AND INQUIRY

meanings of terms appear, and major developments have turned upon two
kinds of criteria for appropriateness of form and meaning of such concepts:
(a) Judgments that certain aspects of statistical evidence are irrelevant
to evidential meaning. These are expressed below as axioms of statistical
evidence. (The principal ones, sufficiency, conditionality (ancillarity), and
likelihood, stem from the work of Fisher, as indicated below.)
(b) A requirement of adequate operational content, in terms of suitably
controlled probabilities that evidential interpretations will be strongly
misleading. This will be discussed below as the main feature of the confidence
concept of statistical evidence. (As indicated below, this is a concept of
evidence which has developed more or less systematically in applications
of mathematical statistics, particularly of the theories of Neyman and
Pearson.)
These criteria, individually and in various combinations, have played
key roles during recent decades in constructions of statistical inference
theory and in criticisms of such theory. This paper is devoted primarily
to a presentation of some main features of these developments, which
seem to have intrinsic as well as historical interest. These developments
include an apparently decisive negative outcome.
It has seemed to some (including this writer) that any adequate concept
of statistical evidence must meet at least certain minimum versions of
both of the criteria just indicated. But the difficulties of developing such a
concept have become increasingly apparent, and it now seems rather clear
that no such adequate concept of statistical evidence can exist.

1.2 Technical Introduction: Statistical Models of Experiments, and Models of Statistical Evidence.

The concepts and issues to be discussed are represented amply by
reference to very simple models of experiments. We shall denote by E any
specified model of an experiment. The forms of such models, and examples,
are discussed in detail below. Here we merely mention these examples:
tossing a given newly-bent coin 50 times, and observing the number of
heads appearing; or observing the number of apparent cures in 50 clinical
trials of a new drug.
We shall denote by x any specified observed outcome of an experiment E.
Then (E,x) is a model of an experiment and its observed outcome, which
we shall call a model of statistical evidence. Referring to the examples
mentioned, (E,40) represents that 40 heads were observed in 50 tosses of
the coin, or that 40 apparent cures were observed in 50 clinical trials. Even
such simple cases of statistical evidence cannot be interpreted or used
without raising at least implicitly the questions and incompletely-resolved issues
to be discussed.
In later sections we shall discuss Bayesian models, denoting by G any
specified prior probability distribution. Then a Bayesian model of an
experimental situation is denoted by (G,E); and a Bayesian model of an
experimental situation and its outcome is denoted by (G,E,x). The latter
represents an instance of statistical evidence in a Bayesian formulation;
and, as we shall review, this represents for typical purposes just a posterior
probability distribution. (To illustrate by reference to the examples
mentioned, (G,E,40) might include, as a main feature, that the posterior
probability is negligible (e.g., less than .001) that the long-run rate of apparent
cures would be below 60 percent; or that the probability of heads is below
70 percent. Given evidence, with different priors, yields generally different
posteriors.)
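
As a rough numerical illustration of the parenthetical remark above, the following sketch computes posterior probabilities for a Bayesian model of 40 successes in 50 trials on a discrete grid of parameter values. The grid, the two priors (one uniform, one concentrated near 1/2), and all function names are assumptions introduced only for illustration; none of them comes from the text.

```python
from math import comb

def binom_pmf(x, n, theta):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def posterior(prior, x, n):
    """prior: dict mapping theta -> prior probability (hypothetical discrete G).
    Returns the posterior distribution over the same grid (Bayes' rule)."""
    joint = {t: w * binom_pmf(x, n, t) for t, w in prior.items()}
    total = sum(joint.values())
    return {t: v / total for t, v in joint.items()}

def prob_below(dist, bound):
    """Probability that theta lies strictly below `bound`."""
    return sum(p for t, p in dist.items() if t < bound)

# Assumed grid of parameter points: 0.005, 0.010, ..., 0.995.
GRID = [k / 200 for k in range(1, 200)]

# Assumed prior 1: uniform over the grid.
uniform_prior = {t: 1 / len(GRID) for t in GRID}

# Assumed prior 2: concentrated near theta = 1/2.
raw = {t: (t * (1 - t)) ** 20 for t in GRID}
norm = sum(raw.values())
peaked_prior = {t: v / norm for t, v in raw.items()}

post_uniform = posterior(uniform_prior, 40, 50)
post_peaked = posterior(peaked_prior, 40, 50)

# Posterior probability that the long-run rate is below 60 percent.
p_uniform = prob_below(post_uniform, 0.60)
p_peaked = prob_below(post_peaked, 0.60)
```

Under these assumed priors the same evidence yields visibly different posterior probabilities for the event "rate below 60 percent", which is the point of the closing sentence of the parenthesis.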
In later sections we shall also discuss decision-theoretic models, denoting
by D any specified set of alternative decisions, D = {dk}; and by U any
specified loss (or utility) function. Then a decision-theoretic model of an
experimental situation is denoted by (E,D,U); and a specified outcome of
such a situation is denoted, for example, by d3. Neither models nor
concepts of statistical evidence are essentially involved in statistical decision
theory. But the confidence concept of statistical evidence, discussed below,
has developed in significant, though incompletely formalized, connection
with decision theory. Illustrating by reference to our examples, d3 could
denote a decision to call the coin biased and not to use it in a game
of chance; or a decision to call the drug “promising” and to adopt it for
larger-scale investigation and/or clinical use.
Similarly, a Bayesian decision-theoretic model of an experimental situation
is denoted by (G,E,D,U); and a specified outcome, for example, by d3.
Our problem-area may be indicated, with reference to our examples, by
the question: What precise general concepts, not dependent on concepts
of prior probability or decision theories, are available for interpreting and
using the statistical evidence (E,40) about a newly-bent coin, or about
clinical trials of a new drug?
A statistical model of an experiment E is constituted by a specified set
of alternative probability models of an experiment. It suffices here to
consider only discrete probability models: By S = {x} we denote the sample
space, or set of possible outcomes, of an experiment E. We denote the
probability of a specified outcome x of E by f = f(x), a discrete probability
density function (henceforth abbreviated pdf). Then each subset B of S has
a probability given by

$$\mathrm{Prob}(B) = \sum_{x \in B} f(x).$$

A discrete probability model of an experiment is represented by (S,f),
where S and f(x) are specified. (The axioms of probability in the discrete
case require only f ≥ 0 and Prob(S) = 1. Cf. e.g. Feller, 1968, 1966.)
For our examples, the sample space is S = {0,1,2, . . ., 50}; and, if the coin
is unbiased (or if the long-run rate of apparent cures is 50 percent), then
the pdf has the binomial form

$$f(x) = \binom{50}{x} \left(\frac{1}{2}\right)^{50}, \quad \text{for } x = 0, 1, 2, \ldots, 50.$$
In contrast with probability theory, the mathematical and conceptual
problems specific to mathematical statistics concern:

(i) specified sets of alternative probability models, representing the
incompleteness of assumed knowledge of the (probabilistic) structure
of an experimental situation; and
(ii) problems of appropriate interpretation or use of any specified
observed outcome x.

A set of probability density functions is represented by a function f = f(x,θ)
which, for each fixed value of θ, is a pdf. By Ω = {θ} we denote the specified
range of θ. Ω is called the parameter space, and each element θ is called a
parameter point. Thus a statistical model of an experiment is represented
by E = (Ω,S,f), where Ω, S, and f are specified. In our examples, a possible
statistical model is the set of probability models represented respectively
by the binomial pdf’s

$$f(x,\theta) = \binom{50}{x} \theta^{x} (1-\theta)^{50-x}, \quad x = 0, 1, \ldots, 50, \quad \text{for } 0 \le \theta \le 1.$$

It will suffice to base most of our discussion on finite models of statistical
experiments, that is, models in which both the sample space and the
parameter space contain finite numbers of points. Any such model can be
represented when convenient by a rectangular array of its probabilities.
For example we may write

$$E_1 = \begin{pmatrix} .75 & .25 \\ .50 & .50 \end{pmatrix}$$

to indicate that a particular statistical model of an experiment, E1, is
constituted by the given array of numbers, in which each row represents one
of the probability models which constitute the statistical model. (Each row
contains non-negative numbers whose sum is unity. Any such array is
called a stochastic matrix.) Thus the number of elements in a row (i.e.,
the number of columns) is the number of points in the sample space, two
in this example. And the number of rows (also two here) is the number of
points in the parameter space.
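
The array representation can be checked mechanically. In the sketch below the entries used for E1 are inferred from what the text says about it (p12 = .25, and equal probabilities for the two outcomes when i = 2); the helper names `is_stochastic` and `shape` are hypothetical.

```python
from fractions import Fraction as F

# Rows are parameter points i, columns are sample points j.
E1 = [[F(3, 4), F(1, 4)],
      [F(1, 2), F(1, 2)]]

def is_stochastic(matrix):
    """True if every row has non-negative entries summing to one."""
    return all(all(p >= 0 for p in row) and sum(row) == 1 for row in matrix)

def shape(matrix):
    """(number of parameter points I, number of sample points J)."""
    return len(matrix), len(matrix[0])
```

Exact fractions are used so that the row sums can be compared with 1 without rounding error.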
It is convenient to adopt the usual generic matrix notation for such
arrays,

$$\begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1J} \\ p_{21} & p_{22} & \cdots & p_{2J} \\ \vdots & \vdots & & \vdots \\ p_{I1} & p_{I2} & \cdots & p_{IJ} \end{pmatrix}.$$

Here J is the generic number of sample points, I is the number of parameter
points, and the generic probability pij denotes the probability of sample
point j under the assumption that the parameter point is i. Thus in E1 we
have p12 = .25; while the two outcomes j = 1 and 2 are equally probable,
on the assumption that i = 2. The relations with the more customary notation
above are: x = j, θ = i, and

$$f(j,i) = \mathrm{Prob}(j \mid i) = p_{ij} = f(x,\theta) = \mathrm{Prob}(x \mid \theta) = p_{\theta x}.$$

Another example to which we shall refer below is a finite model of a
kind which (despite its simplicity) has found important use in linkage
experiments in classical (Mendelian) genetics. These are experiments to
determine whether two genes are located on the same chromosome. (Cf. e.g.
Bailey 1961.) When certain inbred strains of mice are crossbred, each of
their progeny will have a certain trait A with probability 1/4 if a certain two
genes lie on different chromosomes, but with probability 1/2 if the genes lie
on the same chromosome. (Genetical judgment, theoretical and experimental,
is of course involved in adopting such a simple model as an
adequate basis for some interpretations of results. We return to these
important considerations below.)
The model E1 above represents a miniature experiment in which a single
progeny is observed for presence or absence of trait A. Here i = 1 denotes
the hypothesis that the genes lie on different chromosomes, and i = 2
denotes the alternative hypothesis; and j = 1 or 2 denotes respectively
absence or presence of trait A. If A is observed, then the model of statistical
evidence is (E1,2) or

An experiment in which just two progeny are observed has the model

$$E_2 = \begin{pmatrix} 9/16 & 6/16 & 1/16 \\ .25 & .50 & .25 \end{pmatrix};$$

and an experiment in which 50 progeny are observed has the model


where

$$p_{1j} = \binom{50}{j-1} (.25)^{j-1} (.75)^{50-(j-1)},$$

$$p_{2j} = \binom{50}{j-1} (.5)^{50}, \quad \text{for } j = 1, 2, \ldots, 51.$$

Once more we have binomial probabilities, with j - 1 here as the number of
A’s observed.
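
The models for one, two, and fifty progeny all follow one pattern, which the following sketch makes explicit. It assumes, as in the genetical example, that trait A has probability 1/4 under i = 1 and 1/2 under i = 2, and that column j corresponds to j - 1 observed A's; the function name `progeny_model` is hypothetical.

```python
from fractions import Fraction as F
from math import comb

def progeny_model(n, thetas=(F(1, 4), F(1, 2))):
    """Stochastic matrix for n progeny.

    Row i holds the binomial probabilities of k = j - 1 = 0, 1, ..., n
    occurrences of trait A when the probability of A is thetas[i].
    """
    return [[comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1)]
            for t in thetas]

E1 = progeny_model(1)    # single progeny
E2 = progeny_model(2)    # two progeny
E50 = progeny_model(50)  # fifty progeny: 2 rows, 51 columns
```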

2. SOME CONCEPTS OF STATISTICAL EVIDENCE

Several of the concepts to be discussed concern statistical evidence in a
way which is simple though abstract. It is convenient to formulate and
discuss these concepts by use of the notation Ev(E,j), which may be read
as “the content, or the evidential meaning, of the statistical evidence (E,j).”
Each of these concepts is an assertion that certain instances of statistical
evidence are equivalent (or have the same evidential meaning), without
any more explicit indication of the nature of statistical evidence. The
possible value of such modest and indirect steps is recommended precisely
by the long-evident difficulty of specifying in clear positive terms the nature
of statistical evidence. We shall indicate an assertion of equivalence between
given instances of statistical evidence (E,j) and (E',j') by writing Ev(E,j) =
Ev(E',j').
Other concepts to be introduced (notably the confidence and likelihood
concepts) specify statistical evidence in more positive terms.

2.1 The Sufficiency Concept


The main content of the sufficiency concept of statistical evidence is the
assertion, which we shall denote by (S1), that any two outcomes j, j', of a
given experiment E represent equivalent evidence if their probabilities are
proportional under respective parameter values; that is,

(S1): Ev(E,j) = Ev(E,j') if, for some c > 0, pij = c pij' for each i ∈ Ω.

For example in the model of the genetical example above,

$$E_2 = (p_{ij}) = \begin{pmatrix} 9/16 & 6/16 & 1/16 \\ 1/4 & 1/2 & 1/4 \end{pmatrix} = \frac{1}{16} \begin{pmatrix} 9 & 6 & 1 \\ 4 & 8 & 4 \end{pmatrix},$$

no two possible outcomes would, under (S1), represent equivalent evidence,
since no two columns of the model of E2 are proportional.
But in the model

$$E' = (p'_{ij'}) = \frac{1}{16} \begin{pmatrix} 9 & 3 & 3 & 1 \\ 4 & 4 & 4 & 4 \end{pmatrix}$$

the second and third columns are proportional (in fact identical); the
condition of (S1) is met, with c = 1: p'i2 = p'i3, for i = 1 and 2. Thus (S1)
implies Ev(E',2) = Ev(E',3).
To exemplify the extramathematical content of this concept, consider the
genetical experiment described above which is represented by E2. It is quite
possible in practice to complement such an experiment, in case its outcome
is j = 2, by tossing a fair coin and observing whether it falls heads or
tails. Precisely because this addition to the genetical experimental procedure
seems so clearly trivial, and irrelevant to the statistical and genetical
evidence in each possible outcome, it is interesting and useful to consider
formally the model of the augmented experiment and to compare it with
the original model: The new sample space has four points, which we may
label by

$$j' = \begin{cases} 1, & \text{if } j = 1, \\ 2, & \text{if } j = 2 \text{ and heads is observed}, \\ 3, & \text{if } j = 2 \text{ and tails is observed}, \\ 4, & \text{if } j = 3. \end{cases}$$

It is readily verified that the new model has the form E' = (p'ij') given
above. The indicated relations between probabilities in the respective
models are

$$p_{i1} = p'_{i1}, \quad p_{i2} = p'_{i2} + p'_{i3}, \quad \text{and} \quad p_{i3} = p'_{i4}, \quad \text{for } i = 1 \text{ and } 2.$$

Thus E' is seen to be another model of the genetical experiment
represented by E2, including an addition which is trivial but harmless. The
addition cannot hinder us in recognizing that the two outcomes j' = 2 and 3
of E' represent the same genetical and statistical evidence, so long as we
think of those outcomes as differing only in a way that is irrelevant to
evidential meaning. This equivalence is conveniently suggested by the
respective labels: (j = 2, heads) and (j = 2, tails).
However since we consider the model E' as valid (even if unparsimonious),
we may also represent it in our usual generic notation by E' = (p'ij').
What formal aspects of the two models of statistical evidence (E',2) and
(E',3) correspond to their (agreed or postulated) equivalence of evidential
meaning? The answer is just the condition (S1) given above: In our
example, the use of a fair coin corresponds to the value c = 1 in (S1); use of
a coin with probability c/(c+1) of heads corresponds to the general case.
Thus (S1) may be described heuristically as the concept that recognizable
pure “noise” is irrelevant to evidential meaning.
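
The augmentation by a coin toss, and the resulting proportional columns, can be reproduced numerically. The following is a sketch only: the helper names `split_column` and `proportional` are hypothetical, columns are indexed from 0 in the code (so the text's outcome j = 2 is column 1), and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

E2 = [[F(9, 16), F(6, 16), F(1, 16)],
      [F(1, 4), F(1, 2), F(1, 4)]]

def split_column(model, col, heads=F(1, 2)):
    """Replace column `col` by two columns, (outcome, heads) and (outcome, tails),
    where the auxiliary coin falls heads with probability `heads`."""
    return [row[:col] + [row[col] * heads, row[col] * (1 - heads)] + row[col + 1:]
            for row in model]

def proportional(model, a, b):
    """True if column a equals c times column b for some c > 0, in every row."""
    col_a = [row[a] for row in model]
    col_b = [row[b] for row in model]
    # Zeros must occur in the same rows, and all cross-products must agree.
    same_zeros = all((x == 0) == (y == 0) for x, y in zip(col_a, col_b))
    cross = all(col_a[i] * col_b[k] == col_a[k] * col_b[i]
                for i in range(len(model)) for k in range(len(model)))
    return same_zeros and cross

E_aug = split_column(E2, 1)                # fair coin: c = 1
E_biased = split_column(E2, 1, F(2, 3))    # heads with probability c/(c+1), c = 2
```

With a fair coin the two new columns are identical; with the biased coin they differ by the constant factor c = 2, which is still the proportionality that the condition above asks for.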
We have illustrated the extramathematical content of concept (S1) by
an example of the kind usually used to illustrate its plausibility and to
support its adoption. In the example, a given model was made more
complicated, in a way considered trivial and irrelevant to evidential meaning, as
expressed by (S1). Conversely, adoption of (S1) supports certain
convenient commonly-made simplifications of models used in practice: If just
the evidential meaning of outcomes is of interest, there is no need to
distinguish between outcomes which are equivalent under (S1), and they
may be treated as equivalent alternative indications of one outcome. Such
treatment is naturally represented by a simpler new model, with a smaller
sample space, determined from the original model.
An example is provided once more by the genetical experiment: Instead
of the model E2 above, we may obtain a different, more complete model
of the same experimental situation as follows: If the progeny are labeled by
u = 1, 2, respectively, then the experiment has the four possible outcomes

(y1,y2) = (0,0), (0,1), (1,0), or (1,1),

where yu = 1 or 0 according as progeny u has trait A or does not. Denoting
these sample points respectively by j' = 1, 2, 3, 4, the more complete model
is seen to have the form E' = (p'ij') given above. Thus this more
complete and accurate model E' of the genetical experiment is identical with
the model we obtained earlier by augmenting E2 in a way considered trivial
and irrelevant for evidential meaning. Thus simplification of E' to the
form E2 is parsimonious and without effect on evidential meanings. The
indicated correspondence between the sample spaces and the probabilities
in the models E2 = (pij) and E' = (p'ij') is summarized by

$$j = j(j') = \begin{cases} 1, & \text{for } j' = 1, \\ 2, & \text{for } j' = 2 \text{ or } 3, \\ 3, & \text{for } j' = 4, \end{cases}$$

and by the equations given above relating the pij’s and p'ij'’s.

The concept of evidence denoted by (S1) above, together with its
consequences for simplification of models just illustrated,2 constitute the
sufficiency concept of statistical evidence. This is conveniently formulated as
follows:

(S): The sufficiency axiom: If E = (pij) is determined from E' = (p'ij') by any
function j = j(j') which takes distinct values on any two values of j' which
do not label proportional columns of E', then for each (E',j') we have
Ev(E',j') = Ev(E,j) where j = j(j').
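
The simplification licensed by the sufficiency axiom can be sketched as a procedure that merges proportional columns and records the function relating the two sample spaces. This is one illustrative implementation, not a method given in the text; the names `proportional` and `reduce_model` are hypothetical, and sample points are numbered from 1 to match the text.

```python
from fractions import Fraction as F

def proportional(col_a, col_b):
    """True if col_a equals c times col_b for some c > 0."""
    same_zeros = all((a == 0) == (b == 0) for a, b in zip(col_a, col_b))
    cross = all(col_a[i] * col_b[k] == col_a[k] * col_b[i]
                for i in range(len(col_a)) for k in range(len(col_a)))
    return same_zeros and cross

def reduce_model(model):
    """Merge proportional columns of a stochastic matrix.

    Returns (reduced_model, j_of), where j_of maps each original sample
    point j' (1-based) to its label j (1-based) in the reduced model.
    """
    n_cols = len(model[0])
    columns = [[row[c] for row in model] for c in range(n_cols)]
    groups = []  # each group lists the 0-based indices of mutually proportional columns
    for c, col in enumerate(columns):
        for g in groups:
            if proportional(columns[g[0]], col):
                g.append(c)
                break
        else:
            groups.append([c])
    # Probabilities of merged outcomes add.
    reduced = [[sum(row[c] for c in g) for g in groups] for row in model]
    j_of = {c + 1: k + 1 for k, g in enumerate(groups) for c in g}
    return reduced, j_of

# The four-outcome model of the two-progeny experiment.
E_prime = [[F(9, 16), F(3, 16), F(3, 16), F(1, 16)],
           [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]]
E_reduced, j_of = reduce_model(E_prime)
```

Applied to the four-outcome model, the procedure merges the second and third sample points and leaves the others alone, recovering the three-outcome model of the two-progeny experiment.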

The simplification of models by use of sufficient statistics is sometimes
justified by appeal to operational considerations which may be judged more
or less distinct from the appeal to concepts of evidence (or of evidential
irrelevance) just illustrated. This is particularly the case in decision-theoretic
discussions, where concepts of evidence play no specific role. In terms of
the genetical examples above, such an operational justification of replacing
E' by the simpler E2 is the following: The genetical experiment, when
represented just by the simplified model E2, can be complemented if desired
in an operationally realizable way by the toss of a fair coin, as described
above, so as to yield an experiment whose model is identical with the
original E'. Here the two perspectives, operational and evidence-conceptual,
seem not only compatible but even complementary and mutually
supporting. Unfortunately this is not the case more generally, as we shall see.
The sufficiency concept was formulated by R. A. Fisher (1920) in
connection with relatively specific problems of estimation, namely comparisons
of alternative estimators of precision of measurements under the normal
error model. (It played a distinct though implicit role earlier, in certain
simplifications of models like the one just illustrated.) Like most of the
other precise general concepts of evidence to be discussed, the main
content of the sufficiency concept is linked rather delicately with the forms of
experimental models. An investigator’s judgment that a given model is
approximately valid and useful often entails no very precise guidance in
terms of “approximately sufficient” statistics. This is another important
kind of consideration which tends to support eclectic and informal practice
in applied statistics. (For some historical and applied-statistical
methodological comments, and further references, cf. Fisher, 1950, p. 2.757a, and
Tukey, 1960, pp. 471-473.)
It is convenient to introduce here some definitions required below: Any
function which depends only upon a sample point is called a statistic.
Examples are each of the functions j = j(j') considered above. Any statistic
meeting the condition in (S) is called a sufficient statistic.3 Each of the
statistics j(j') we have considered thus far is sufficient, and so is the trivial
function j(j') = j'. A statistic which is not sufficient, in

is j(j') = 1 for j' = 1, 2. (This function fails to take distinct values on values
j' labeling nonproportional columns.) Thus the sufficiency axiom (S) may
be stated thus: If j(j') is a sufficient statistic in E', and if j(j'0) = j(j'1),
then Ev(E',j'0) = Ev(E',j'1) = Ev(E,j), where j = j(j'0).
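
The definition of a sufficient statistic can be turned into a direct check. The sketch below is illustrative only: `is_sufficient` is a hypothetical helper, statistics are given as dictionaries from 1-based sample points to values, and the single-progeny model (rows .75, .25 and .50, .50) is used as an assumed example of a model with nonproportional columns.

```python
from fractions import Fraction as F
from itertools import combinations

def proportional(col_a, col_b):
    """True if col_a equals c times col_b for some c > 0."""
    same_zeros = all((a == 0) == (b == 0) for a, b in zip(col_a, col_b))
    cross = all(col_a[i] * col_b[k] == col_a[k] * col_b[i]
                for i in range(len(col_a)) for k in range(len(col_a)))
    return same_zeros and cross

def is_sufficient(model, statistic):
    """A statistic is sufficient if it takes distinct values on any two
    sample points that label nonproportional columns of the model."""
    n_cols = len(model[0])

    def column(c):
        return [row[c] for row in model]

    for a, b in combinations(range(n_cols), 2):
        if statistic[a + 1] == statistic[b + 1] and not proportional(column(a), column(b)):
            return False
    return True

E_prime = [[F(9, 16), F(3, 16), F(3, 16), F(1, 16)],
           [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]]
E_single = [[F(3, 4), F(1, 4)],
            [F(1, 2), F(1, 2)]]

merging = {1: 1, 2: 2, 3: 2, 4: 3}   # merges the two proportional columns
identity = {1: 1, 2: 2, 3: 3, 4: 4}  # the trivial statistic
constant = {1: 1, 2: 1}              # ignores the outcome entirely
```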

2.2 The Confidence Concept


We shall refer to the concept of statistical evidence by which estimates
having the confidence region form are usually interpreted, as the confidence
concept. The formal theory of confidence region estimation does not include
reference to any concepts of statistical evidence, as has been emphasized
consistently by the theory’s inaugurator, Neyman (1937, 1938, 1962).
Thus when a confidence region estimate is interpreted as representing
statistical evidence about a parameter point of interest, an investigator
or expositor has adjoined to an application of the formal theory his concept
of statistical evidence, which we refer to as the confidence concept.
We illustrate in terms of the important binomial case represented by
our examples. Such examples are often treated by reference to charts which
appear in many texts and manuals of statistical methods, given originally
by Clopper and Pearson (1934). Our brief discussion of details may be
complemented by reference to such a chart (or to equivalent tables of the
binomial distributions or of the approximating normal distributions; cf.
e.g. Walker and Lev, 1953, p. 461).
A lower 99 percent confidence limit estimator of the binomial parameter
θ is by definition any function of the observed outcome x, denoted
conveniently by θ(.99,x), which has the property of providing a correct lower
bound for the unknown value of θ with probability at least .99, under each
possible θ; that is, any function θ(.99,x) of x satisfying

$$\mathrm{Prob}(\theta(.99,x) \le \theta \mid \theta) \ge .99, \quad \text{for each } \theta, \ 0 \le \theta \le 1.$$

Application of such an estimator to the statistical evidence (E,40), for
example, might give the result θ(.99,40) = .63 (as found below). Such
a lower confidence limit estimate is usually interpreted as a lower bound
on the unknown true value θ, whose correctness is supported by fairly strong
statistical evidence, indexed by .99. (An estimate is a fixed number,
obtained by use of an estimator, a function of x.) Such an interpretation is
represented conveniently by writing Conf(θ ≥ .63) ≥ .99. (Such notation
has been used, for example, in the text of Walker and Lev, l.c. p. 54.)
The number .99 appears above as the lower bound of probabilities of
correctness (of bounds on θ, given by an estimator). Such a number is
called the confidence level, or the confidence coefficient,4 of the estimator
θ(.99,x) and of an estimate such as θ(.99,40) = .63. This probability
property is cited as basic in the usual explanations of the confidence
concept, and in justifications of the use of confidence region estimators. It is
referred to as an operational property of the method. (The term “operating
characteristics” is used more or less generally to refer to the
error-probabilities which are the basic terms of the theory of estimation due to Neyman and
the related theory of tests due to Neyman and Pearson. “Operational” here
refers to the realizability, in long sequences of comparable independent
applications, of observable relative frequencies approximating the
probabilities mentioned.) These explanations usually include implicit or explicit
reference to the confidence concept, without which such estimates would
receive no evidential interpretation.
In our binomial examples, a convenient commonly used approximate
formula for such an estimator is

$$\theta(.99,x) = \frac{x}{n} - \Phi^{-1}(.99) \sqrt{\frac{x}{n}\left(1 - \frac{x}{n}\right) \Big/ n} = \frac{x}{50} - 2.33 \sqrt{\frac{x}{50}\left(1 - \frac{x}{50}\right) \Big/ 50}.$$

For (E,40) this gives θ(.99,40) ≈ .67. (Φ denotes the standard normal
distribution function.) Here x/n is of course the usual point estimator of a
binomial parameter, and the second term is a multiple of the usual estimator
of its standard error. A precise formula for such an estimator is θ(.99,x) =
the smallest value θ for which

$$\sum_{v < x} f(v,\theta) \le .99,$$

where f(x,θ) is the binomial pdf of our examples,
$\binom{50}{x} \theta^{x} (1-\theta)^{50-x}$. (The verification that
this estimator has the required basic property is immediate, and is described
conveniently in many texts by reference to the Clopper-Pearson charts.
Because of the discreteness of binomial distributions, no estimator (function
of x) can meet exactly the indicated bound on error-probabilities.)
Antecedents of both the method of confidence region estimation and the
confidence concept can be traced back as far as the earliest systematic
consideration of statistical evidence, that of Bernoulli, l.c. (Cf. also e.g.
Dempster, 1964, pp. 56-7).
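
The approximate and exact lower confidence limits described above can be computed with the standard library alone. The sketch below is an illustration, not the author's procedure: the exact limit is taken to be the smallest θ at which the probability of observing fewer than x successes drops to the level .99, found here by bisection, and the function names and tolerance are assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(v, n, theta):
    """Binomial probability of v successes in n trials."""
    return comb(n, v) * theta**v * (1 - theta)**(n - v)

def approx_lower_limit(x, n, level=0.99):
    """Normal-approximation lower limit: x/n minus a multiple of its standard error."""
    p_hat = x / n
    z = NormalDist().inv_cdf(level)  # about 2.33 when level is .99
    return p_hat - z * sqrt(p_hat * (1 - p_hat) / n)

def exact_lower_limit(x, n, level=0.99, tol=1e-10):
    """Smallest theta for which the sum of f(v, theta) over v < x is at most `level`."""
    if x == 0:
        return 0.0

    def prob_fewer_than_x(theta):
        return sum(binom_pmf(v, n, theta) for v in range(x))

    # prob_fewer_than_x decreases from 1 (theta = 0) to 0 (theta = 1),
    # so bisection locates the crossing point.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_fewer_than_x(mid) <= level:
            hi = mid
        else:
            lo = mid
    return hi

approx_40 = approx_lower_limit(40, 50)
exact_40 = exact_lower_limit(40, 50)
```

For 40 successes in 50 trials the two functions should land near the values .67 and .63 quoted in the text, the exact limit being the more cautious of the two.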
As the example illustrates, the general method of confidence region
estimation is often applied by use of techniques provided by the theory of
point estimation, a self-contained branch of mathematical statistics. Point
estimates, accompanied by distribution theory adequate to indicate their
precision, yield confidence region estimates (often approximate ones) for
a wide variety of problems. Indeed, use of the confidence concept is
currently the favored mode of application and interpretation of point
estimation methods. Thus confidence region estimation techniques share important
features with the long-standing practice of reporting measurements
(estimates) of quantities accompanied by their standard errors (or other
indices of precision); and the confidence concept may be regarded as
including usual interpretations of such reports of measurements. Further
expository remarks on these relations are given e.g. in Birnbaum (1961a).
The paper cited also describes briefly a mode of treating tests of
statistical hypotheses which is currently favored when feasible, namely,
embedding testing problems in confidence region estimation problems. To
illustrate in terms of our genetical example where θ has just the two
possible values .75 and 1/2, the confidence interval estimation result
Conf(.63 ≤ θ ≤ .94) ≥ .98 incorporates conveniently an application of a
test (between the hypotheses .5 and .75) and its interpretation (“reject
θ = .5, at significance level stronger than 2%”).5
In the modern general theory of estimation by confidence regions,
attention is directed toward minimizing probabilities that confidence regions
will include values of θ other than the true value, subject to the basic
defining condition (inclusion of the true value of θ with specified
probability). Such minimization also tends to give small confidence regions, or
sharp confidence limits. Precise definitions of optimality or efficiency, in
terms of minimization of various error probabilities, play a key role in
this theory as the basis for choosing among the numerous possible
estimators meeting the basic condition.
In our example the estimator $\theta(.99, x)$ is optimal, among all possible
99 percent lower confidence limit estimators, with one significant
qualification: This estimator is not optimal if we allow consideration also of
estimators of the more general randomized form $\theta(.99, x, z)$, depending
not only on the binomial observation $x$ but possibly also on an auxiliary
randomization variable $z$. The latter phrase denotes just a “random
number” $z$, i.e. an observation on a random variable with a uniform
distribution on the unit interval $0 \leq z < 1$. (Thus $z$ is comparable to the
augmenting observation (heads or tails) in our discussion in the preceding
section. We may illustrate by discrete $z$: $z = 0, .1, .2, \ldots,$ or $.9$.
Better improvement is obtained with a finer discrete distribution,
$z = 0, .01, .02, \ldots,$ or $.99$. The limiting case, a continuous uniform
distribution of $z$, gives fullest improvement and also gives exactness in
the sense of meeting the bound on error-probabilities, for each $\theta$.)
A convenient formula for an optimal randomized confidence limit
estimator is: $\theta(.99, x, z)$ = the value of $\theta$ such that

$$\sum_{v<x} f(v,\theta) + z\,f(x,\theta) = .99.$$

Such an estimator gives bounds on $\theta$ always as sharp as, and usually
sharper than, our estimator $\theta(.99, x)$. (It is correspondingly superior in
minimizing probabilities of bounds below any given false value below the
true value of $\theta$.) In our binomial example $(E,x)$, if we thus adjoin an
observed random number $z = .6$, we obtain $\theta(.99, 40, .6) = .65$, larger
(sharper) than $\theta(.99, 40) = .63$. This result with the new estimator is
represented by writing: Conf($\theta \geq .65$) = .99.
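
The randomized limit just described can be computed numerically. The
following is only a minimal sketch: it assumes the binomial pdf
$f(v,\theta)$ with 50 trials, and the function names and the bisection
solver are illustrative choices, not part of the original account. Setting
z = 0 recovers the nonrandomized limit, and larger z gives a larger
(sharper) lower bound, in line with the values of about .63, .64 and .65
quoted in the text.

```python
from math import comb


def binom_pmf(v, n, theta):
    """Binomial pdf f(v, theta) for n trials."""
    return comb(n, v) * theta**v * (1.0 - theta) ** (n - v)


def randomized_lower_limit(x, z, n=50, level=0.99, tol=1e-12):
    """Solve sum_{v<x} f(v,theta) + z*f(x,theta) = level for theta.

    The left side decreases from 1 to 0 as theta runs from 0 to 1
    (for 0 < x < n), so plain bisection suffices.
    """
    def g(theta):
        below = sum(binom_pmf(v, n, theta) for v in range(x))
        return below + z * binom_pmf(x, n, theta)

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > level:
            lo = mid      # root lies to the right of mid
        else:
            hi = mid      # root lies to the left of (or at) mid
    return 0.5 * (lo + hi)


# Limits for the observed outcome x = 40 with different random numbers z.
limits = {z: randomized_lower_limit(40, z) for z in (0.0, 0.2, 0.6)}
```

The dictionary `limits` makes the nonuniqueness discussed next concrete:
the same outcome x = 40 yields a different bound for each value of z.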

2.3 Restriction of the Confidence Concept to Accommodate the Sufficiency Concept

In the preceding example, observation of $z = .2$ instead of $z = .6$ would
have determined the result Conf($\theta \geq .64$) = .99 instead of
Conf($\theta \geq .65$) = .99. The new interpretation of the evidence $(E,40)$
from our binomial experiment is clearly different from the old one: The
smaller lower bound for $\theta$ represents a strictly weaker inference or
conclusion (since $\theta$ might have a value between the new and the old
bounds), but the common
ALLAN BIRNBAUM 125

confidence coefficient indicates identical support for the stronger and
weaker inferences on the basis of the same evidence $(E,40)$. Such a
feature has usually been judged inappropriate for any precise techniques
and general concepts for treating statistical evidence. Even the feature
of mere nonuniqueness of estimates $\theta(.99, 40, z)$ based on given evidence
$(E,40)$ has similarly been judged inappropriate.
It is just such judgments which are represented in the sufficiency
concept (S) discussed above, since these cases of nonuniqueness of evidential
interpretations are traceable to just the dependence of the estimator on the
auxiliary randomization variable. The usual practice in applied statistics,
supported by a widely held theoretical judgment, is to restrict
consideration to nonrandomized confidence region estimators. By thus
sacrificing (at least in discrete problems) a degree of efficiency and
exactness in terms of error-probabilities, applications of the confidence
concept are freed from incompatibility with the sufficiency concept, while
they keep validity of reference to the basic operational property of
confidence region estimators. (See for example the views of E. S. Pearson,
coauthor of the original charts of binomial confidence limits, and of the
theory of tests which is a technical basis for the theory of confidence
regions, as indicated e.g. in Tukey, 1962, pp. 12-13.)

2.4 The Likelihood Concept


Another general concept of statistical evidence, the likelihood concept
(often called the likelihood principle), consists of two parts. One is an
axiom which resembles the sufficiency axiom and indeed implies the latter.
The second part specifies in more positive terms a mode of interpretation
of statistical evidence, and thus resembles the confidence concept in some
respects.
To formulate these, we require the definition of the likelihood function
(which plays important roles, technical and conceptual, in various areas
of mathematical statistics): For each model $(E,j)$ of statistical evidence,
the function $p_{ij}$ of $i$, $i \in \Omega$, is called the likelihood function.
Here $j$ is fixed. More precisely, the function $p_{ij}$ is specified as one
among many alternative, equivalent representations of the same likelihood
function, all having the form $c\,p_{ij}$, where $c$ denotes an arbitrary
positive number.6
We discuss the axiom first:

(L): The likelihood axiom: If two models of statistical evidence $(E,j)$ and
$(E',j')$ determine the same likelihood function, then they represent the
same evidential meaning. That is, if for some positive $c$ we have
$p_{ij} = c\,p'_{ij'}$ for each $i \in \Omega$ (the common parameter space of
$E$ and $E'$), then $Ev(E,j) = Ev(E',j')$ (where $E = (p_{ij})$,
$E' = (p'_{ij'})$).

We note that when we take E' = E, this becomes just the sufficiency
axiom.7
The second part of the likelihood concept complements the likelihood
axiom, by indicating to some extent how a likelihood function may be
interpreted as a representation of evidential meaning. It consists of the
statement that the evidence supporting one parameter point $i$ against
another $i'$ is represented just by the numerical value of the likelihood ratio
$L(i,i') = p_{ij}/p_{i'j}$, with the value unity marking neutral evidence and
successively larger values indicating stronger support for $i$ against $i'$.
The concept specifies no further structure or interpretation for the
likelihood ratio scale, nor any specific concept of “evidence supporting a
set of parameter points.”
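
As a small numerical illustration of the likelihood ratio, consider the
binomial outcome of 40 successes in 50 trials used in the earlier examples,
with the two parameter points .75 and .5 of the genetical example. The
sketch below is illustrative only; the function name is hypothetical, and
it relies on the fact that the binomial coefficient cancels in the ratio.

```python
def binomial_likelihood_ratio(x, n, theta_a, theta_b):
    """L(a, b) = p_aj / p_bj for a binomial outcome j = x in n trials.

    The binomial coefficient is common to numerator and denominator,
    so only the theta-dependent factors remain.
    """
    return (theta_a / theta_b) ** x * ((1 - theta_a) / (1 - theta_b)) ** (n - x)


# Support for theta = .75 against theta = .5, given x = 40 of n = 50.
ratio = binomial_likelihood_ratio(40, 50, 0.75, 0.5)

# Reversing the roles of the two points gives the reciprocal ratio.
reverse = binomial_likelihood_ratio(40, 50, 0.5, 0.75)
```

A value of `ratio` far above unity is read, under the likelihood concept,
as strong support for .75 against .5; the concept itself attaches no
further scale or operational meaning to the number.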
Examples have played a considerable role in the explanation of the
concept, its intended scope, and the nature of its extramathematical
meaning, as given by its principal proponents, Fisher (1925, 1956) and Barnard
(1947, 1949, 1962). Such examples illustrate but do not specifically
complement the preceding general remarks. (We discuss below interpretations
dependent upon prior probability concepts.)
A suggestion of the plausibility of the likelihood concept may be
obtained by noting that the likelihood function represents the relative
probabilities, under respective possible parameter points, of the outcome
observed. This is broadly analogous with the confidence concept, under
which any parameter point outside a confidence region estimate had, if it
be true, only probability .01 (say) of being so excluded, while each point
in the confidence region had, if it be true, probability .99 of being so
included. (It can even be argued that the confidence concept of evidence
itself takes intuitive conceptual plausibility from considerations like the
last remarks, and from some underlying version of the likelihood concept.
Cf. Barnard, 1949.) However the likelihood concept includes no reference
to an operational property resembling that of confidence region estimators;
and in fact it possesses no such property, as indicated in the next section.
Concerning the possible scope and fuller specification of the
likelihood concept, Fisher (1956) considered it highly desirable to develop a
more complete and satisfactory concept than likelihood; and he proposed
his concept of fiducial probability as a more appropriate concept of
statistical evidence when technically applicable.8
Acceptance and application of the likelihood concept (apart from
Bayesian approaches) has been very limited among applied and theoretical
statisticians.9 Among the principal reasons, indicated by the preceding
paragraphs, undoubtedly the decisive one is the lack of any operational
interpretation comparable to that of the confidence concept.
The likelihood concept appeared as one of two intimately mingled
aspects of Fisher’s development of the maximum likelihood method of
estimation (1912, 1922, 1925). The second aspect was represented by
concepts based on error-probabilities (usually subsumed in large-sample
distribution and efficiency properties). The latter aspect of maximum
likelihood estimation theory is now clearly merged with Neyman’s theory
of estimation. Usual applications of maximum likelihood estimation are
accompanied by estimates of precision and interpreted by use of the
confidence concept.
A major focus of development of mathematical statistics for several
decades was the complex challenge of appreciating these two aspects, and
of exploring the mathematical and conceptual sides of each, and their
interrelations. In particular, the initial step toward a general theory of
testing hypotheses by Neyman and Pearson (1928) was one of exploration and
descriptive unification based on tentative adoption of a version of the
likelihood concept. The latter was referred to later as a concept which
had been incompletely understood, and was abandoned in favor of
error-probabilities as basic terms (Neyman and Pearson, 1933). In Quetelet’s
(1845, 1849) influential expositions of the normal law of error, the
likelihood function (referred to as the “scale of possibility,” and “relative
probability”) was displayed, and evidential interpretations of estimates
were supported by appeal (at least implicitly) to the likelihood concept
along with other concepts. The distinctness of the likelihood and other
concepts tends to be obscured in the context of the normal error model,
which dominated most statistical thinking until this century. Some earlier
more or less distinct appearances of the likelihood concept are discussed
by M. G. Kendall (1961) and this writer (1967).

2.5 Incompatibility of the Likelihood and Confidence Concepts


The likelihood axiom is quite incompatible with the confidence
concept (and indeed with usual evidential interpretations of most standard
statistical techniques). For example, the lower 99 percent confidence limit
estimate of $\theta$ in the binomial example $(E,40)$ above was determined by
the formula $\theta(.99, 40)$ = the smallest value $\theta$ such that

$$\sum_{v<40} \binom{50}{v}\,\theta^{v}(1-\theta)^{50-v} = .99.$$

We observe that application of this method requires that $(E,40)$ be given
in sufficient detail. In particular, the method cannot be applied if $(E,40)$
is represented only by the likelihood function $\theta^{40}(1-\theta)^{10}$, for
$0 \leq \theta \leq 1$, as would be required for compatibility.10
The preceding comments show that usual evidential interpretations of
confidence region estimates are incompatible with the likelihood axiom.
Moreover the likelihood concept is evidently incompatible with any
operational property like that of confidence region estimators, as we
illustrate briefly by an example. (This example is artificial, but represents
in a simple discrete case some main features of examples which have been of
interest, cited below, involving normal distributions.) Let
$\theta = (\mu,\sigma)$: The values $\sigma = 0$ or $100$, respectively, represent
the unknown precision, either very high or very low, of a single observation
(measurement) $x$; and $\mu$, an integer, $-10^{10} \leq \mu \leq 10^{10}$, is the
unknown true value of a quantity to be measured. Let

$$f(x,\theta) = f(x,\mu,\sigma) = \begin{cases} 1, & \text{if } \sigma = 0 \text{ and } x = \mu, \\ 0, & \text{if } \sigma = 0 \text{ and } x \neq \mu, \text{ for all } x, \mu; \\ c\,(100 - |x-\mu|), & \text{if } \sigma = 100, \text{ for } |x-\mu| < 100, \\ 0 & \text{otherwise.} \end{cases}$$

Thus $\sigma = 0$ gives an error-free measurement $x = \mu$; $\sigma = 100$ gives
a wide “triangular” distribution of $x$ centered at $\mu$. The constant
$c = 1/10{,}000 = 10^{-4}$ (to give total probability 1). The likelihood
function determined by any of the possible observations $x$, $|x| < 10^{8}$, is

$$f(x,\mu,\sigma) \begin{cases} = 1, & \text{for } \sigma = 0 \text{ and } \mu = x; \\ = 100c = 10^{-2}, & \text{for } \sigma = 100 \text{ and } \mu = x; \\ < 100c = 10^{-2}, & \text{for } \sigma = 100 \text{ and } \mu \neq x. \end{cases}$$

A simple application of the likelihood principle, in accord with the account
of the preceding section, is the following: The parameter point
$\theta = (\mu,\sigma) = (x,0)$, that is, $\mu = x$ and $\sigma = 0$, has uniquely
maximum likelihood. (Its likelihood ratio, as against each other point in
turn, exceeds unity.) Thus it seems to be supported by the observed result,
as against each and all other parameter points. This includes, in particular,
that the value $\sigma = 0$ seems to be supported against the alternative
$\sigma = 100$. Evidently the numerical values of the likelihood ratios
referred to, all at least 100, represent strong evidence in some sense.
Evidently such interpretations of statistical evidence must be regarded as
misleading, and strongly misleading, in case the true value of $\sigma$ is not
0 but 100. But such misleading interpretations will be suggested by the
likelihood principle with probability unity, if for example $\sigma = 100$ and
$\mu = 0$ are the true parameter values, since in this case each possible
outcome $x$ determines a likelihood function of the form considered. Thus it
seems that the likelihood concept cannot be construed so as to allow useful
appraisal, and thereby possible control, of probabilities of erroneous
interpretations.
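
The claim that the misleading interpretation arises with probability unity
can be checked by simulation. The sketch below assumes the pdf
$f(x,\mu,\sigma)$ of this example with $c = 10^{-4}$; the helper names, the
seed, and the number of draws are illustrative assumptions only.
Observations are drawn with true values $\mu = 0$, $\sigma = 100$, and for
each one the likelihood of the point $(x, 0)$ is compared with the best
supported point having $\sigma = 100$.

```python
import random

C = 1.0 / 10_000  # normalizing constant: sum over k = -99..99 of (100 - |k|) is 10,000


def f(x, mu, sigma):
    """pdf of a single measurement x under parameter point (mu, sigma)."""
    if sigma == 0:
        return 1.0 if x == mu else 0.0
    d = abs(x - mu)
    return C * (100 - d) if d < 100 else 0.0


def draw_x(mu, sigma, rng):
    """Draw one observation from the model (error-free or triangular)."""
    if sigma == 0:
        return mu
    offsets = list(range(-99, 100))
    weights = [100 - abs(k) for k in offsets]
    return mu + rng.choices(offsets, weights=weights)[0]


def support_for_sigma_zero(x):
    """Likelihood ratio of (mu=x, sigma=0) against the best point with sigma=100."""
    best_alternative = max(f(x, mu, 100) for mu in range(x - 99, x + 100))
    return f(x, x, 0) / best_alternative


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
ratios = [support_for_sigma_zero(draw_x(0, 100, rng)) for _ in range(1000)]

# Fraction of draws in which sigma = 0 appears supported by a ratio of
# at least 100, although the true value is sigma = 100.
misleading_fraction = sum(r >= 100 - 1e-9 for r in ratios) / len(ratios)
```

Every simulated outcome favors the false value $\sigma = 0$ by a factor of
at least 100, so no choice of threshold on the likelihood ratio scale
controls the probability of this erroneous interpretation.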
Broadly analogous examples and conclusions, concerned with the
likelihood principle as a justification for maximum likelihood estimation and
also more generally, have been given by Neyman (1938), Armitage (in
Smith, 1961, pp. 32-34), and Stein (1962).
Curiously, the situation is quite different in the severely restricted case
of a parameter space of just two points, $i = 1$ or $2$. (This case has
practical relevance, as our genetical example shows, although usually in
practical cases the parameter space is much larger [uncountably infinite].)
For this case, one may say that there does exist a concept of statistical
evidence having all of the desiderata considered in this paper, operational
and intuitive or axiomatic: Here a single number, the likelihood ratio
$L(1,2)$, represents the likelihood function. According to the likelihood
principle, the value $L(1,2) = 100$, for example, represents evidence of some
positive strength, indexed by the value 100, supporting $i = 1$ against
$i = 2$. These interpretations can be given the form of confidence region
estimates: Choosing confidence level .99 and observing just $L(1,2)$
determined by an observed outcome of any experiment,

if $L(1,2) \geq 100$, the confidence set estimate is just the point $i = 1$,
if $L(1,2) \leq .01$, the estimate is $i = 2$,
otherwise the confidence set is $i = 1$ or $2$ (an uninformative estimate,
reflecting weakness of evidence).
The validity of such a confidence set estimator follows from the following
basic property of likelihood ratios: For each $c > 1$, the probability is at
most $1/c$ that $L(1,2) \geq c$, if $i = 2$. (Proof:

$$\text{Prob}(L(1,2) \geq c \mid i = 2) = \sum p_{2j} \leq \frac{1}{c}\sum p_{1j} \leq \frac{1}{c},$$

where each sum is over those $j$ for which $L(1,2) = p_{1j}/p_{2j} \geq c$, or
$p_{2j} \leq \frac{1}{c}\,p_{1j}$.)
In addition to the preceding conservative operational property, it is
easily shown that in experiments which would be called highly informative
under any plausible definition, the probability is high of “strong-looking
non-misleading evidence” in the sense of values $L(1,2)$ far above unity if
$i = 1$, and far below unity if $i = 2$.
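
The bound on misleading likelihood ratios can be verified exactly in a
concrete two-point case. The sketch below is an illustration under stated
assumptions: it takes the binomial model with 50 trials and the two
parameter points .75 (labeled i = 1) and .5 (labeled i = 2) from the
genetical example; the function names are hypothetical.

```python
from math import comb


def pmf(x, n, theta):
    """Binomial pdf for x successes in n trials."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def prob_ratio_at_least(c, n=50, theta1=0.75, theta2=0.5, true_theta=0.5):
    """Exact probability, under true_theta, that L(1,2) = p_1j / p_2j >= c."""
    total = 0.0
    for x in range(n + 1):
        ratio = pmf(x, n, theta1) / pmf(x, n, theta2)
        if ratio >= c:
            total += pmf(x, n, true_theta)
    return total


# Probability of misleading evidence of strength c when i = 2 is true,
# paired with the bound 1/c from the proof above.
bounds = {c: (prob_ratio_at_least(c), 1.0 / c) for c in (2, 10, 100, 1000)}

# Probability of strong non-misleading evidence when i = 1 is true.
strong_when_true = prob_ratio_at_least(100, true_theta=0.75)
```

In each pair of `bounds` the first entry does not exceed the second, as
the inequality requires; `strong_when_true` indicates how often an
experiment of this size yields a ratio of at least 100 in favor of the
true point.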

2.6 The Conditionality Concept


Still another broad concept which is expressible as an axiom has played
significant roles here. This is the conditionality concept (or ancillarity
concept). We indicate it first by reference to our binomial examples: Suppose
that the number of tosses of the coin (or the number of progeny to be bred
and observed for trait A, or the number of clinical patients to be observed)
is not certain to be 50, but will be either 50 or 200, depending upon certain
uncontrollable unpredictable conditions. We suppose that these conditions
are unrelated to the parameter $\theta$ of interest, and that it is known to be
equally probable that the number will be 50 or 200. (This exemplifies a
feature found in many models applied in practice, although it is a feature
which is frequently not evident without mathematical analysis, as we indicate
below.)
If the number of observations turns out to be 50, then the experimental
situation is represented by the binomial model $E_{50}$ as before; and if it
turns out to be 200, then another binomial model, $E_{200}$, will apply. But an
accurate and more complete model of the experimental situation is
available, within which the models $E_{50}$ and $E_{200}$ are embraced as
possible models, with probability 1/2 that each will turn out to be the one
applicable.
The experimental situation represented by this broader model $E$ will
result first in either $E_{50}$ or $E_{200}$ being realized and then in
observation of an outcome. Thus each sample point of $E$ has the form
$(E_{50},x)$, with $x = 0, 1, \ldots, 50$, or $(E_{200},x)$ with
$x = 0, 1, \ldots, 200$. We note that here $(E_{50},x)$ plays not its usual
role as a model of statistical evidence, although it is that, but the role of
a sample point of another experiment $E$. Hence a model of statistical
evidence represented by an outcome of $E$ has one of the forms

$$(E,(E_{50},x)) \quad \text{or} \quad (E,(E_{200},x)).$$

The pdf’s of $E$ are readily determined. (For example,

$$f((E_{50},x),\theta) = \tfrac{1}{2}\binom{50}{x}\theta^{x}(1-\theta)^{50-x}.)$$

Let us consider in this context $E$ the outcome $(E_{50},40)$, which represents
the statistical evidence $(E,(E_{50},40))$. The various concepts and techniques
of evidential interpretation discussed above, and others, are applicable here
in their usual general ways. For example an optimal 99 percent lower
confidence limit estimator of $\theta$ can be given as before (notwithstanding
greater complexity of the definition of optimality and the calculations).
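
The mixture model can be written out directly. The following sketch is
illustrative only: it assumes binomial component models with 50 and 200
trials, each realized with probability 1/2, and the function names and the
particular value of theta are hypothetical. It checks that the pdf of the
broader experiment is a proper probability distribution over all sample
points, and that the likelihood function determined by the outcome with 50
trials and 40 successes within the mixture is one half of the likelihood
function of the same outcome in the component model alone.

```python
from math import comb


def binom_pmf(x, n, theta):
    """pdf of the component binomial model E_n."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def mixture_pdf(n, x, theta):
    """pdf of the sample point (E_n, x) of the mixture E, for n in {50, 200}."""
    return 0.5 * binom_pmf(x, n, theta)


theta = 0.7  # arbitrary illustrative parameter value

# Total probability over every sample point (E_50, x) and (E_200, x).
total = sum(mixture_pdf(n, x, theta) for n in (50, 200) for x in range(n + 1))

# Ratio of the mixture likelihood to the component likelihood at several
# parameter values; it is the same constant throughout.
ratios = [
    mixture_pdf(50, 40, t) / binom_pmf(40, 50, t)
    for t in (0.5, 0.6, 0.75, 0.9)
]
```

Because the ratio is constant in theta, the two models of evidence
determine the same likelihood function in the sense of proportionality.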
However it has seemed to many theoretical and applied statisticians that
a model of evidence such as $(E,(E_{50},40))$ here, comprehensive and
accurate though it is, contains parts which are clearly irrelevant to
evidential meaning: given the bad luck of getting just 50 rather than 200
observations, the evidential meaning of the outcome of the fifty observations
is fully represented by $(E_{50},40)$; and the hypothetical possibility that an
additional 150 observations might have become available but in fact did
not, seems irrelevant to the evidential meaning of the result. Formally, this
is represented by adopting $(E_{50},40)$ in place of $(E,(E_{50},40))$ as the
appropriate model of the statistical evidence obtained (or as a more
parsimonious, though equivalent model). Within this “conditional” model any
chosen mode of evidential interpretation could be applied. For example
the confidence limit estimate found for $(E_{50},40)$ in section 2.2 above might
be determined and interpreted in the usual way.
The general form of the concept just illustrated may be stated
by reference to any model $E = (p_{ij})$ which is a mixture of two other models
$E' = (p'_{ij'})$ and $E'' = (p''_{ij''})$ in the sense illustrated, namely:

(i) The three models have common parameter space.

(ii) The sample points $j$ of $E$ are labeled alternatively, in any fixed
order, by the symbols $(E',j')$, for $j' = 1, 2, \ldots$, and $(E'',j'')$, for
$j'' = 1, 2, \ldots$.

(iii) $$p_{ij} = \begin{cases} (1/2)\,p'_{ij'}, & \text{if } j = (E',j'), \\ (1/2)\,p''_{ij''}, & \text{if } j = (E'',j''). \end{cases}$$

(Mixtures are usually defined more generally, allowing more than two
component experiments, and possibly unequal weights; but this simple
restricted definition suffices for our discussion.) The general form of the
concept illustrated is11

(C): The conditionality axiom: If $E$ is a mixture of the experiments
$E' = (p'_{ij'})$ and $E'' = (p''_{ij''})$, then $Ev(E,(E',j')) = Ev(E',j')$.

The conditionality concept emerged in Fisher’s theory of estimation, a
little later than the sufficiency and likelihood concepts. In his last writings
on statistical inference, he emphasized both the importance he attached to
the concept and the incompleteness of knowledge of its mathematical and
extramathematical aspects. (“The most important step which has been
taken so far to complete the theory of estimation is the recognition of
Ancillary statistics” [1956, pp. 157-8].)
Other theoretical and applied statisticians have seen the conditionality
concept as an appropriate and even essential complement to the
operationally-based confidence concept. They have supported its application,
in cases illustrated by the example above, as supplying necessary content of
an intrinsically evidence-conceptual kind, and guidance for choices among
confidence region estimators. (Cox, 1958, pp. 359-63, Wallace, 1958, p.
864, and further references therein.) Against this background, the
nonuniqueness of conditional models in certain problems (Basu, 1959,
Birnbaum, 1961b) appeared initially as a significant but not insuperable
problem, to be met by further investigation of the scopes of the
conditionality and confidence concepts.

2.7 Incompatibility Between the Conditionality and Confidence Concepts


The general approach described in the preceding paragraph is one of
reliance on the confidence concept, with restrictions to accommodate the
conditionality as well as the sufficiency concept. Although this perspective
remains a preferred one among many applied and theoretical statisticians,
it has become clear that it must remain at best a rather eclectic one,
because it embraces conceptual ingredients fully as disparate as the
likelihood and confidence concepts, whose incompatibility was discussed above:
It has been seen that the conditionality and sufficiency axioms together are
equivalent to the likelihood axiom. This has surprised and disappointed
some, including this writer, who remain without an adequate precise
general concept of statistical evidence, and without even consistent criteria
for adequacy of such a concept.
The proof that (C) and (S) imply (L) is elementary: For any
$E' = (p'_{ij'})$ and $E'' = (p''_{ij''})$ with common parameter space, let
$(E',j')$ and $(E'',j'')$ be any models of evidence determining a common
likelihood function: For some $c > 0$, $p''_{ij''} = c\,p'_{ij'}$ for each $i$.
By applying (C) to the mixture $E$ of $E'$ and $E''$ we obtain

$$Ev(E',j') = Ev(E,(E',j'))$$
and
$$Ev(E'',j'') = Ev(E,(E'',j'')).$$

Now the likelihood functions determined by $(E,(E',j'))$ and $(E,(E'',j''))$
are, respectively, $\tfrac{1}{2}p'_{ij'}$ and $\tfrac{1}{2}p''_{ij''}$. These are
proportional columns of $E$. Therefore an application of (S) gives

$$Ev(E,(E',j')) = Ev(E,(E'',j'')).$$

Hence
$$Ev(E',j') = Ev(E'',j''),$$
which is the conclusion of (L), completing the proof (given by this writer,
1962). (For another concept which many have found highly plausible,
and which along with (S) implies (L), see Pratt, 1961, 1962.)

3. STATISTICAL THEORIES OF DECISION-MAKING, RATIONAL BEHAVIOR, AND INDUCTIVE BEHAVIOR

3.1 General Remarks

The historical roots of probability concepts were linked with those of
concepts of utility and of reasonable decision-making under uncertainty, in
the problems concerning betting odds in games of chance, which led to
early probability theory, and also in the earliest work concerned critically
with the interpretation of probabilities, that of James Bernoulli. Many
writers have since affirmed, and many denied, that other interpretations of
probabilities can and should be given, interpretations independent of utility
and decision concepts. In any case, the probabilities $p_{ij}$ appearing in
standard models of experiments have usually been given other
interpretations, generally referred to as frequency interpretations.
It was in another role that utility and decision concepts re-entered
mathematical statistics in recent decades, notably with Neyman (e.g. 1957)
and Wald (1950), and played a major role. (The concepts had also
appeared clearly in this role in the introductions by Laplace and Gauss of the
estimation criteria of mean-absolute-error and mean-square-error. But as
the normal error model came to receive almost exclusive attention in
estimation theory, that model’s precision parameter assimilated the technical
role of such criteria, and their specific conceptual content was neglected.
New technical and conceptual problems appeared with increasing consideration
of other models, and also of the normal model with precise attention to the
imperfect estimate of its precision parameter.)
This role of utility and decision concepts might be described as connected
with interpretations and applications of outcomes $j$ of given experiments
$E = (p_{ij})$. But more precisely, this role is not related to interpretations
of cases of statistical evidence $(E,j)$, nor to concepts of evidence, in the
general sense of this paper. (This point has not always been clearly
appreciated. Some sources of misunderstanding are indicated below.) Rather,
this role is related primarily to another aspect of the experimental situation,
represented by another mathematical ingredient which complements the
model $E$ of an experiment, when statistical decision problems are
formulated. This is the decision space (illustrated briefly in section 1
above), a set $D = \{d\}$ of points (labels) $d$ of the respective alternative
decisions or actions, one of which is to be adopted on the basis of the
observed outcome $j$ of $E$. An artificially simplified but relevant
illustration can be devised in connection with the genetical example above:
Let $d_1$ denote the decision to launch a commercial stock-breeding program, or
alternatively to launch a program of further genetical research, in a way
which would be advantageous and appropriate if and only if two specified genes
lie on different chromosomes. Suppose $d_2$ denotes the sole alternative
possible decision, for example to launch a different program, or to do
nothing. The problematic situation represented in part by the model $(E,D)$ is
resolved when a decision, $d_1$ or $d_2$, is adopted.
In statistical decision theory, each possible policy, or rule for choice of
decisions, is represented by a decision function $d(j)$ with values in $D$
(possibly randomized, i.e., depending also upon an auxiliary randomization
variable). The theory develops concepts and techniques of advantageous
choice, based on the utility concepts of loss and risk or utility, and on
criteria for optimal choice (admissibility, minimax, etc.).
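
A decision function and its risk can be made concrete in the two-decision
genetical illustration. The sketch below rests on several assumptions that
are not in the text: a binomial model with 50 trials and parameter points
.5 and .75, a simple zero-one loss, the pairing of each decision with the
parameter point under which it is appropriate, and all function names. It
computes the risk of a rule as the probability of the inappropriate
decision, and checks dominance, the relation by which admissibility is
defined.

```python
from math import comb

N = 50
THETAS = (0.5, 0.75)                   # two parameter points (assumed)
CORRECT = {0.5: "d2", 0.75: "d1"}      # ASSUMPTION: decision suited to each point


def pmf(x, n, theta):
    """Binomial pdf for x successes in n trials."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def threshold_rule(k):
    """Decision function d(x): adopt d1 when x >= k, otherwise d2."""
    return lambda x: "d1" if x >= k else "d2"


def reversed_rule(k):
    """The perverse rule: adopt d2 when x >= k, otherwise d1."""
    return lambda x: "d2" if x >= k else "d1"


def risk(rule, theta):
    """Expected zero-one loss: probability of the wrong decision under theta."""
    return sum(pmf(x, N, theta) for x in range(N + 1) if rule(x) != CORRECT[theta])


def dominates(rule_a, rule_b):
    """True if rule_a is never worse than rule_b and somewhere strictly better."""
    ra = [risk(rule_a, t) for t in THETAS]
    rb = [risk(rule_b, t) for t in THETAS]
    return all(a <= b for a, b in zip(ra, rb)) and any(a < b for a, b in zip(ra, rb))
```

A rule that is dominated by another, such as `reversed_rule(32)` relative
to `threshold_rule(32)`, is inadmissible. Two threshold rules with
different cutoffs trade one error probability against the other, so
neither dominates; choosing between them requires a further criterion.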
Statistical decision theory has played significant roles in several
developments in modern mathematical statistics. One of these developments is a
new approach to prior probability concepts, due to Savage (1954): The
admissibility criterion for reasonable choice is a weak and basic,
noncontroversial one in the context of statistical decision problems. It is of
some intuitive as well as technical interest that this criterion admits
accurate formal interpretation in terms of hypothetical implicitly-held prior
probabilities: Each admissible decision function would be also an optimal one
for a person holding a suitably corresponding prior probability distribution.
(Here an extended technical sense of prior pdf is sometimes required, a
sense generally favored also by those holding prior probability concepts.)
Savage showed that certain richer criteria, characterizing also consistency
among all preferences in such contexts, entail prior probabilities and linked
utilities, at least hypothetical implicitly-held ones, which would account for
any specified consistent preference pattern. (Savage is also a notable modern
exponent of the view that probabilities in general, not only prior
probabilities, are essentially linked with utilities. But this issue is rather
independent of the result mentioned.)
Another important role of statistical decision theory has been based on
its introduction of new mathematical approaches by which previously
established branches of mathematical statistics have been appreciably
extended, notably the theories of testing hypotheses and estimation due to
Neyman and Pearson, and of point estimation. In this connection the loss
and utility concepts of decision theory play no distinctive role, since
error-probabilities remain the basic terms of the theories thus extended. Here
the specification of the decision space is chosen to represent one of the
standard statistical problems mentioned, and not to represent more concrete or
practical alternative decisions faced by an investigator using a test or
estimate. As an example of testing, in the genetics experiment, “rejection
or non-rejection of the hypothesis” that the genes lie on different
chromosomes would be called the alternative “decisions.” (The loss and utility
or risk functions for testing, and for confidence interval estimation which
is linked technically with testing, are here simply indicators of errors and
of probabilities of errors, not measures of more specific practical losses and
utilities which might be entailed by errors. [Cf. e.g. Birnbaum, 1961a.]
For point estimation the loss function may be squared error, linked to the
traditional variance criterion and precision concept, or absolute error, or
any other function specified as an appropriate measure of the loss entailed
by a given error of estimation. The case of squared error has been most
fully treated, for reasons partly technical and no doubt partly intuitive or
traditional. The role of general loss functions in such formal point estimation
theory is correlated with the limited application of point estimates except
when accompanied by an estimate of precision [not loss].)
Now concepts of statistical evidence, including those discussed above,
played essential roles at various stages in the development of theories and
applications of estimation and testing hypotheses, notwithstanding the
perennially problematic character of such concepts. But in recent decades
statistical decision theory has proved eminently successful in providing
increased mathematical scope, and also one kind of conceptual unity, for
these theories. These two circumstances have been interpreted by many
(but far from all) as showing that the concepts of decision and utility
introduced by decision theory have provided appropriate or successful
explications of the previously troublesome and obscure concepts of statistical
evidence; and/or that they have provided an appropriate way of
circumventing questions concerning such concepts.
Such interpretations are illusory and unfortunate. Concepts of statistical
evidence can hardly be circumvented in general, since they are involved
explicitly, inextricably and irreducibly, in the actual structures and processes
of scientific disciplines such as genetics where the essential decision-theoretic
concepts, utility and decision, have little or no clearly relevant role.
The point concerning decisions in particular has been expressed notably
by Cox (1958, p. 354): “It might be argued that in making an inference
we are ‘deciding’ to make a statement of a certain type about the
populations and that, therefore, provided the word decision is not interpreted
too narrowly, the study of statistical decisions embraces that of inferences.
The point here is that one of the main general problems of statistical
inference consists in deciding what types of statement can usefully be made and
exactly what they mean. In statistical decision theory, on the other hand,
the possible decisions are considered as already specified.”

3.2 The Compounding Approach


Important extensions of decision theory are provided by the compounding
approach, in which decision problems which might be treated separately
are treated jointly in ways which yield appreciable gains in terms of error
frequencies or utilities. (Neyman, 1962, and Robbins, 1963, give
nontechnical accounts.) A particularly striking feature of the compounding
approach is that these gains are available even when the respective
problems lie in arbitrarily related or unrelated empirical research fields. This
feature has been appreciated and endorsed by the originator of the approach
(Robbins, l.c., p. 112). (The problems might lie respectively in agronomy
and astronomy; and/or in eugenics, euthenics, eumetrics, and hermetics.
Thus the approach may be described as opening wider domains over which
utility is to be gained, by the compounding of errors.)
Now the notion is widely and firmly held that errors or apparent errors
in one research field are irrelevant to evidence in another field of investigation
judged unrelated to the first.12
For those who see significant scope for this notion, the compounding
approach provides an inadvertent but striking reductio ad absurdum of the
view that decision theory provides the appropriate ways of interpreting and
using statistical evidence in general.

4. PRIOR PROBABILITY CONCEPTS

4.1 Technical Introduction


A prior probability model of an experiment, (G,E), is constituted by a
model of an experiment $E = (p_{ij})$ as above, complemented by a mathematical
probability distribution over the parameter space $\Omega$, denoted by G
and represented by a pdf $g_i = \mathrm{Prob}(i)$, called the prior (or a priori)
distribution. Correspondingly, a prior probability model of statistical evidence,
(G,E,j), is a model (E,j) of evidence as above, complemented by a prior
distribution G.
Now (G,E) is a probability model, as distinct from a statistical model,
of an experiment, in our usage introduced above. That is, it represents just
one, not two or more, probability distributions on a given sample space.
Its sample space is not $S = \{j\}$ (the sample space of E), but the set of
points (i,j), for $i \in \Omega$ and $j \in S$. (We may denote that space by
$\Omega \times S = \{(i,j)\}$.)
Correspondingly, (G,E,j) is not a statistical model of an experiment
(although it includes such a model (E,j) as an ingredient), and the concepts
of statistical evidence discussed above are not even formally applicable.
Moreover, the observed outcome j here is not an elementary outcome
(sample point) of the probability model (G,E) but represents the set of
sample points (i,j), $i \in \Omega$. Finally, the model (G,E) is given in a form which
specifies some conditional distributions: $p_{i1}, p_{i2}, \ldots, p_{ij}, \ldots, p_{iJ}$ is now the
conditional pdf of j, given i.


From (G,E,j), a standard elementary formula (Bayes’ theorem, not
Bayes’ principle or postulate) gives the conditional probability of i, given j:

$$g^*_i = \mathrm{Prob}(i \mid j) = k\, g_i\, p_{ij}, \quad \text{for each } i \in \Omega,$$

where k is a normalizing constant ($1/k = \sum_{i' \in \Omega} g_{i'}\, p_{i'j}$). This represents a new
mathematical probability distribution G* over $\Omega$, called the posterior (or a
posteriori) distribution.
In our genetical example, a mathematically possible prior distribution G
is the uniform one, which assigns equal probabilities $g_1 = g_2 = 1/2$ to the two
parameter points i = 1,2. The latter labels indicate respectively the parameter
values $\theta = 1/2$ and $1/4$ in the binomial distributions in that model. In a
more general genetical model, allowing for breaking and rearranging of
chromosomes, the parameter space becomes $1/4 \le \theta \le 1/2$, and the model of
our experiment with 50 observations is once again constituted by the corresponding
set of binomial distributions. On this parameter space also,
various possible mathematical probability distributions G may be defined.
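The posterior computation for the two-point genetical model can be sketched numerically. This is only an illustration under stated assumptions: the two parameter values are taken to be 1/2 and 1/4, the prior is the uniform one, and the observed count j = 20 out of 50 is a hypothetical outcome chosen for the example, not a figure from the text. The function names are likewise illustrative.

```python
from math import comb

def binomial_pmf(j, n, theta):
    """Probability of j successes in n independent trials with success chance theta."""
    return comb(n, j) * theta**j * (1 - theta)**(n - j)

def posterior(prior, likelihoods):
    """Bayes' theorem: g*_i = k * g_i * p_ij, where 1/k = sum over i' of g_i' * p_i'j."""
    joint = [g * p for g, p in zip(prior, likelihoods)]
    k = 1.0 / sum(joint)
    return [k * w for w in joint]

thetas = [0.5, 0.25]   # parameter points i = 1, 2 (assumed values)
prior = [0.5, 0.5]     # uniform prior G
n, j = 50, 20          # 50 observations; j = 20 is a hypothetical outcome

likelihoods = [binomial_pmf(j, n, t) for t in thetas]
post = posterior(prior, likelihoods)
print(post)            # posterior probabilities g*_1, g*_2
```

With a uniform prior the ratio of the two posterior probabilities equals the ratio of the two binomial probabilities of the observed outcome, since the prior weights cancel.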

4.2 Relation to the Likelihood Concept


We may note that in this context of prior probability models, a precise
version of the likelihood principle is evident: The preceding formula shows
that the posterior probabilities determined by (G,E,j) depend upon the
statistical evidence (E,j) only through its likelihood function $c\,p_{ij}$ (since the
latter, multiplied by $g_i$, gives $g^*_i$ up to a constant, and the constant is determined
as above); and relative posterior probabilities $g^*_i / g^*_{i'}$
differ from corresponding relative prior probabilities $g_i / g_{i'}$ only by the factor
$L(i,i') = p_{ij} / p_{i'j}$, the likelihood ratio.
Of course this observation does not provide interpretations or justifications
of the likelihood concept outside of prior probability contexts, except
by way of broad and loose analogy. On the other hand, any support for
the likelihood principle on grounds independent of prior probability concepts
may be regarded as also partial support for the latter concepts (since
it diminishes the scope of questions of extramathematical interpretation and
justification of prior probability concepts). Such possible independent
grounds include self-evidence of the likelihood concept; or its entailment
by other concepts which may be considered plausible, such as (S) and (C)
as above.
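The relation between prior odds, posterior odds, and the likelihood ratio described in this section can be checked numerically. The sketch below assumes a two-point binomial model with parameter values 1/2 and 1/4 and a hypothetical outcome of 20 successes in 50 trials; the priors tried and the helper names are illustrative choices, not taken from the text.

```python
from math import comb

def binomial_pmf(j, n, theta):
    """Probability of j successes in n trials with success chance theta."""
    return comb(n, j) * theta**j * (1 - theta)**(n - j)

def posterior(prior, likelihood):
    """Normalize prior * likelihood; any constant factor in `likelihood` cancels."""
    joint = [g * p for g, p in zip(prior, likelihood)]
    total = sum(joint)
    return [w / total for w in joint]

n, j = 50, 20                                   # j = 20 is a hypothetical outcome
p = [binomial_pmf(j, n, 0.5), binomial_pmf(j, n, 0.25)]
likelihood_ratio = p[0] / p[1]                  # L(1,2) = p_1j / p_2j

def odds_pair(prior):
    """Return (posterior odds, prior odds * likelihood ratio) for a two-point prior."""
    post = posterior(prior, p)
    return post[0] / post[1], (prior[0] / prior[1]) * likelihood_ratio

for prior in ([0.5, 0.5], [0.9, 0.1], [0.2, 0.8]):
    print(prior, odds_pair(prior))              # the two numbers agree for every prior

# Rescaling the likelihood function by any c > 0 leaves the posterior unchanged.
rescaled = [7.0 * x for x in p]
print(posterior([0.9, 0.1], p), posterior([0.9, 0.1], rescaled))
```

The second check illustrates why only the likelihood ratios matter in this setting: the arbitrary constant c is absorbed by the normalizing constant k.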

4.3 Interpretations of Prior Probability Concepts


Of course prior opinions and judgments, based on previously available
observations and theory, play important broad roles in investigations in all
fields. What many investigators and theoretical statisticians have not found
is an adequate account of the appropriateness, the meaning, and the usefulness
of the specified forms of prior probability concepts and methods, in
relation to the background of a specific investigation. (Cf. e.g. Nagel,
1939.) (Of course prior probability concepts are not directly supported by
the difficulties which may be met in other approaches to explication of
probabilities such as the $p_{ij}$’s; not, in particular, by the objections which have
been raised to proposed “relative frequency interpretations” of such probabilities,
as in Keynes’ discussion of the structure of Darwin’s theory
[1921, pp. 108-9].)
Poincaré’s view of the empirical sciences as “but unconscious applications
of the calculus of probability,” without which “Science would be impossible,”
has been endorsed by one of the foremost modern exponents of
prior probability concepts, de Finetti (1937; pp. 99, 156 of the translation
cited below). Here “Science” evidently refers to the actual structures and
developments of the sciences, as these might be characterized by general
principles amounting to a “logic of science” (including, in the case of
de Finetti’s work, a kind of solution of Hume’s general problem of induction).
The rich literature on prior probability concepts attests abundantly
to this broad observation: If there is a “logic of science,” in any one of many
proposed precise senses, then that “logic” will include formal prior probabilities.
It is the actual or potential existence of science, or of a logic of
science, in such a sense which has been proposed by some and questioned
by others for two hundred years.
During recent years, prior distributions have been introduced and used
systematically by some investigators of genetical linkage (Haldane and
Smith, 1947, pp. 12-13; Morton, 1955, pp. 281-4, and 1962, pp. 38-43;
Smith, 1959, pp. 298-9, and 1963, and references therein; and Renwick
and Lawler, 1963, pp. 69-71, 84). In another area, a problem in attribution
of authorship has been investigated by both Bayesian and non-Bayesian
methods, described in detail by Mosteller and Wallace (1964).
It is interesting to compare and contrast such applications of prior probability
concepts with examples which figured in earlier stages of discussions
of such concepts. The latter include Karl Pearson’s and Fisher’s discussions
of anthropometric correlation measurements (Soper et al., 1916; Fisher,
1921).
A discussion of such comparisons will be presented elsewhere, together
with fuller discussion of other points touched on briefly in this paper.

NOTES
1. The writer is indebted for helpful comments on earlier versions to John Pratt,
Charles Fisher, and Valerie Mike, and to Churchill Eisenhart for guidance on historical
material. This work was done in part while the author was a Fellow of the John
Simon Guggenheim Foundation. It was also supported in part by the Office of Naval
Research and the National Science Foundation.
2. The general form of such determination of one model E from another E',
by any given function j(j') (not necessarily having the property considered above),
is: E has as its sample space $S = \{j\}$ the range of values of j(j'), for j' in S'; without
loss of generality for our purposes, we assume this range to have (after possible
relabeling) the form $j = 1, 2, \ldots, J$. The probabilities in $E = (p_{ij})$ are given by the
usual basic probability rule, $p_{ij} = \sum p'_{ij'}$, where the summation is over all j' such
that j(j') = j.
3. The condition given above is well known to be equivalent to the usual defining
condition for a sufficient statistic. (Cf. e.g. Cramer, 1946, p. 488, Lehmann,
1959, Lindgren, 1962.) Since we shall not use the latter condition, it is convenient
to regard the condition in (S) as defining sufficient statistics in our discussion.
4. It is of course common nontechnical usage to call any proposition probable
or likely if it is supported by strong evidence of some kind. Thus it would not be
unnatural here to call the confidence coefficient .99 the probability or the likelihood
of the proposition that $\theta \ge .63$. However such usage is to be avoided as misleading
in this problem-area, because each of the terms probability, likelihood, and confidence
coefficient is given a distinct mathematical and extramathematical usage here.
That of likelihood is introduced in the next section. Probabilities are specified in
our models E and (E,x) only for points and subsets of the sample space $S = \{x\}$.
For points $\theta$ or sets (like $\theta \ge .63$) in the parameter space $\Omega = \{\theta\}$, no probabilities
are defined, except under specific additional formulations (such as the Bayesian, in
which the parameter space is made the sample space of specified prior and posterior
distributions).
5. The concept of statistical evidence which has been associated with tests since
their appearance in 1710, which may be termed the significance-probability concept,
has proved an elusive one to explicate (cf. Anscombe, 1963). The paper in which
the Neyman-Pearson theory of tests originated was an approach to the problem of
explicating and justifying this concept in connection with a systematic mathematical
theory of tests. The paper introduced a new definition of a test as a two-decision
procedure, whose operational error-probability properties allow application of the
confidence concept as it applies to results of such tests. (Cf. especially p. 336 of
Neyman and Pearson, 1933.)
6. When convenient the last form may be used, or the symbol c may be replaced
by any positive number. Thus the only essential aspect of a likelihood function is
the set of likelihood ratios it determines; these are the respective values of the
likelihood ratio function:
$L(i,i') = p_{ij} / p_{i'j}$, for $i, i' \in \Omega$.
7. We observe that the definition of a sufficient statistic and the statements of
(Si) and (S) above can be stated conveniently in terms of the likelihood function.
In fact, the likelihood function is readily seen to be the sufficient statistic which
gives the greatest possible simplification of models in the sense illustrated above; the
latter property makes it a “minimal sufficient” (or “necessary and sufficient”)
statistic.
8. Fisher’s program for a theory of fiducial probability shows that he considered
that a fully adequate concept of statistical evidence would be one having the form
of probability distributions over parameter spaces, determined however in such a
way as to be incompatible with the likelihood axiom (cf. Anscombe, 1957). The
fiducial concept is not discussed further here because the present writer (among
others) has found insufficient clear substance in the mathematical and extramathematical
aspects of that concept. For some recent criticisms, reinterpretations, and/or
revisions, see Fraser, 1961, Dempster, 1964, 1966, and Hacking, 1965.
9. But a semblance of a “likelihood school” has been suggested in recent years
by, for example, Barnard et al., 1962, Anscombe, 1964, D. G. Kendall, 1965, Sprott
and Kalbfleisch, 1965, Zweifel, 1966, and some tentative discussion by this writer,
1961, 1962, pp. 286-98, 325-6.
10. We indicate here in further detail the incompatibility referred to: The given
likelihood function could have arisen alternatively from a different experimental
procedure, E*, in which the newly-bent coin is tossed until just 40 heads have been
observed. If the fortieth head is observed after just 10 tails have appeared, the likelihood
function is the same. But the model of this experiment is quite different from
E. The method of confidence limit estimation is applicable in E* in the same general
way as in E. But in general the (optimal nonrandomized) .99 lower confidence limits
determined from (E*,x*) and (E,x) are different, even when the latter determine
the same likelihood function. Furthermore, there are still other experiments in
which the same likelihood function may appear, but where the method of confidence
limit estimation is inapplicable, or at best applicable only awkwardly and formalistically.
An example is represented by the model with just two sample points, j = 1,2,
and the pdf $f(1,\theta) = \theta^{40}(1-\theta)^{10}$, $f(2,\theta) = 1 - f(1,\theta)$, for $0 \le \theta \le 1$.
11. The formulation of the conditionality concept as one of equivalence, as in
(C) preceding, was proposed by this writer (1962) as the natural explication of the
concept, notwithstanding the one-sided form to which applications of the concept
had been restricted (substitution of simpler for less simple models of evidence).
This proposal seems to have found general acceptance among those interested in
the concept. A restricted formulation of the concept proposed by Basu (1964) is
admittedly a modification of Fisher’s idea (requiring introduction of another extramathematical
concept), and in any case does not lead to another.
12. It is of course not often that explicit reference to this notion is called for.
It could be expressed rather precisely as an axiom for concepts of evidence, analogous
to those above.
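The situation described in note 10 can be illustrated with a short numerical check. The sketch assumes the two procedures are a fixed run of 50 tosses yielding 40 heads (E) and tossing until the 40th head, which arrives after 10 tails (E*); the function names and the grid of parameter values are illustrative. It shows only that the two likelihood functions are proportional; it does not compute the confidence limits, which depend on the full sampling models.

```python
from math import comb

def binomial_likelihood(theta, heads=40, tosses=50):
    """Experiment E: toss a fixed number of times, observe the number of heads."""
    tails = tosses - heads
    return comb(tosses, heads) * theta**heads * (1 - theta)**tails

def inverse_sampling_likelihood(theta, heads=40, tails=10):
    """Experiment E*: toss until `heads` heads appear, observe the number of tails."""
    # The final toss must be a head, so only the first heads + tails - 1 tosses are free.
    return comb(heads + tails - 1, tails) * theta**heads * (1 - theta)**tails

grid = [0.1 * k for k in range(1, 10)]
ratios = [binomial_likelihood(t) / inverse_sampling_likelihood(t) for t in grid]
print(ratios)  # the same constant, 50/40 = 1.25, at every theta (up to rounding)
```

Because the ratio does not depend on the parameter, the two outcomes determine the same set of likelihood ratios even though the sample spaces of the two experiments differ.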

REFERENCES

Anscombe, F. J. 1957. Dependence of the fiducial argument on the sampling
rule. Biometrika, 44:464-69.
Anscombe, F. J. 1963. Tests of goodness of fit. Journal of the Royal Statistical
Society, (B) 25:81-94.
Anscombe, F. J. 1964. Normal likelihood functions. Annals of the Institute
of Statistical Mathematics, (Tokyo) 16:1-17.
Arbuthnot, John. 1710. An Argument for Divine Providence taken from
the constant Regularity observed in the Births of both Sexes. Philosophical
Transactions of the Royal Society, 27:186-90.
Armitage, Peter. 1961. Discussion in Smith, 1961, pp. 30-31, Journal of
the Royal Statistical Society (B) 23:1-37.
Bailey, N. T. J. 1961. Introduction to the Mathematical Theory of Genetical
Linkage. London: Oxford U. Press.
Barnard, G. A. 1947. Review of Sequential Analysis by A. Wald. Journal
of the American Statistical Association 42:658-64.
Barnard, G. A. 1949. Statistical inference. Journal of the Royal Statistical
Society, 11:115-39.
Barnard, G. A., Jenkins, G. M., and Winsten, C. B. 1962. Likelihood inference
and time series. Journal of the Royal Statistical Society, 125:321-27.
Basu, D. 1959. The family of ancillary statistics. Sankhya, 21:247-56.
Basu, D. 1964. Recovery of ancillary information. Sankhya, 26A.
Bayes, Thomas. 1763. An essay towards solving a problem in the doctrine
of chances. The Philosophical Transactions of the Royal Society, 53:
370-418.
Bernoulli, Jacques. 1713. Ars Conjectandi. Basel.
Birnbaum, A. 1961a. Confidence curves: An omnibus technique for estimation
and testing statistical hypotheses. Journal of the American Statistical
Association, 56:246-49.
Birnbaum, A. 1961b. On the foundations of statistical inference. I. Binary
experiments. Annals of Mathematical Statistics, 32:414-35.
Birnbaum, A. 1962. On the foundations of statistical inference (with discussion).
Journal of the American Statistical Association, 57:269-326.
Birnbaum, A. 1967. John Arbuthnot. The American Statistician 21:23-25,
27-29.
Birnbaum, A. 1968. Likelihood. International Encyclopedia of the Social
Sciences.
Boole, G. 1854. Investigations of the Laws of Thought on Which Are
Founded the Mathematical Theories of Logic and Probabilities. London.
Clopper, C. J. and Pearson, E. S. 1934. The use of confidence or fiducial
limits illustrated in the case of the binomial. Biometrika, 26:404-413.
Cox, D. R. 1958. Some problems connected with statistical inference.
Annals of Mathematical Statistics, 29:357-372.
Cramer, Harald. 1946. Mathematical Methods of Statistics. Princeton.
de Finetti, Bruno. 1937. Foresight: its logical laws, its subjective sources.
Annales de l’Institut Henri Poincaré, 7. Translation in Studies in Subjective
Probability, Henry E. Kyburg and Howard E. Smokler, eds. N.Y.:
Wiley.
Dempster, A. P. 1964. On the difficulties inherent in Fisher’s fiducial argument.
Journal of the American Statistical Association, 59:56-66.
Dempster, A. P. 1966. Review of Ian Hacking’s Logic of Statistical Inference.
Journal of the American Statistical Association, 61:1233-5.
Feller, William. 1968, 1966. An Introduction to Probability Theory and
Its Applications, Vols. 1 (3rd ed.), 2. N.Y.: Wiley.
Fisher, R. A. 1912. On an absolute criterion for fitting frequency curves.
Messenger of Mathematics, 41:155.
Fisher, R. A. 1920. A mathematical examination of the methods of determining
the accuracy of an observation by the mean square error.
Monthly Notices of the Royal Astronomical Society, 80:758-70.
Fisher, R. A. 1921. On the “probable error” of a coefficient of correlation
deduced from a small sample. Metron, 1:3-32.
Fisher, R. A. 1922. On the mathematical foundations of theoretical statistics.
Philosophical Transactions of the Royal Society of London, Series
A, 222:309-368.
Fisher, R. A. 1925. Theory of statistical estimation. Proceedings of the
Cambridge Philosophical Society, 22:700-25.
Fisher, R. A. 1950. Contributions to Mathematical Statistics. N.Y.: Wiley.
Fisher, R. A. 1956. Statistical Methods and Scientific Inference. Edinburgh:
Oliver and Boyd.
Fraser, D. A. S. 1961. On Fiducial inference. Annals of Mathematical
Statistics, 32:661-76.
Hacking, I. M. 1965. The Logic of Statistical Inference. Cambridge U.
Press.
Haldane, J. B. S., and C. A. B. Smith. 1947. A new estimate of the linkage
between the genes of color-blindness and haemophilia in man. Annals
of Eugenics, 14:10-31.
Jeffreys, Harold. 1961. Theory of Probability, 3rd ed. London: Oxford
U. Press.
Kendall, David G. 1965. A statistical approach to Flinders Petrie’s sequence-dating.
Bulletin of The International Statistical Institute, 657-80.
Kendall, M. G. 1961. Daniel Bernoulli on maximum likelihood. Biometrika,
48:1-28.
Keynes, John Maynard. 1921. A Treatise on Probability. London: Macmillan.
Laplace, Pierre. 1774. Memoire sur la probabilite des causes pour les
evenements. Memoire de Mathematique et de Physique, Academie Royal
des Sciences, Vol. IV.
Laplace, Pierre. 1820. Theorie analytique des probabilites, 2nd ed. Paris:
Courcier.
Lehmann, E. L. 1959. Testing Statistical Hypotheses. N.Y.: Wiley.
Lindgren, B. W. 1962. Statistical Theory. N.Y.: Macmillan.
Morton, Newton E. 1955. Sequential tests for the detection of linkage.
American Journal of Human Genetics, 7:277-318.
Morton, Newton E. 1962. Segregation and linkage. Pp. 17-52 in Methodology
in Human Genetics, ed. W. J. Burdette. San Francisco: Holden-Day.
Mosteller, Frederick, and David L. Wallace. 1964. Inference and Disputed
Authorship: The Federalist. Reading, Mass.: Addison-Wesley.
Nagel, Ernest. 1939. Principles of the Theory of Probability. University of
Chicago Press.
Neyman, Jerzy. 1937. Outline of a theory of statistical estimation based
on the classical theory of probability. Philos. Trans. Roy. Soc. (A),
236:333-80.
Neyman, Jerzy. 1938. (2nd ed., 1952). Lectures and Conferences on
Mathematical Statistics. Washington, D.C.: Graduate School, U.S. Dept.
of Agriculture.
Neyman, Jerzy. 1947. Raisonnement inductif ou comportement inductif.
Proceedings of the International Statistical Conference, 3:423-33.
Neyman, Jerzy. 1957. Inductive behavior as a basic concept of philosophy
of science. Review of the International Statistical Institute, 25:7-22.
Neyman, Jerzy. 1962. Two breakthroughs in the theory of statistical
decision-making. Review of the International Statistical Institute, 30:11-27.
Neyman, Jerzy and Pearson, E. S. 1928. On the use and interpretation of
certain test criteria for purposes of statistical inference. Biometrika,
20A:175-240, 263-94.
Neyman, Jerzy and Pearson, E. S. 1933. On the problem of the most efficient
tests of statistical hypotheses. Philosophical Transactions of the
Royal Society of London (A) 231:289-337.
Pratt, John W. 1961. Review of Testing Statistical Hypotheses. Journal of
the American Statistical Association, 56:163-7.
Pratt, John W. 1962. Comments on A. Birnbaum’s “On the foundations of
statistical inference,” Journal of the American Statistical Association, 57:
314-5.
Quetelet, Adolphe. 1845. Sur l’appréciation des documents statistiques, et
en particulier sur l’appréciation des moyennes. Bulletin de la Commission
Centrale de Statistique, 2:205-286.
Quetelet, Adolphe. 1849. Letters on the Theory of Probability. Translated
by Downes. London.
Renwick, J. H., and Sylvia D. Lawler. 1963. Probable linkage between a
congenital cataract locus and the Duffy blood group locus. Annals of
Human Genetics, 27:67-76.
Robbins, Herbert. 1963. A new approach to a classical statistical decision
problem (with discussion). Pp. 101-13 in Induction: Some Current
Issues, H. Kyburg and E. Nagel, eds. Wesleyan U. Press.
Savage, Leonard J. 1954. The Foundations of Statistics. N.Y.: Wiley.
Smith, C. A. B. 1959. Some comments on the statistical methods used in
linkage investigations. American Journal of Human Genetics, 11:289-304.
Smith, C. A. B. 1961. Consistency in statistical inference and decision
(with discussion). Journal of the Royal Statistical Society (B), 23:1-37.
Smith, C. A. B. 1963. Testing for heterogeneity of recombination fraction
values in Human Genetics. Annals of Human Genetics, 27:175-182.
Soper, H. E., Young, A. W., Cave, B. M., Lee, A., and Pearson, K. 1916.
On the distribution of the correlation coefficient in small samples. Biometrika,
11:328-413.
Sprott, D. A., and Kalbfleisch, J. G. 1965. Use of the likelihood function
in inference. Psychological Bulletin, 64:15-22.
Stein, C. 1962. A remark on the likelihood principle. Journal of the Royal
Statistical Society (A), 125:565-573.
Tukey, John W. 1960. A survey of sampling from contaminated distributions.
39:448-485 in Contributions to Probability and Statistics, ed.
I. Olkin et al. Stanford U. Press.
Tukey, John W. 1962. The future of data analysis. Annals of Mathematical
Statistics, 33:1-67.
Venn, John. 1888. The Logic of Chance, 3rd ed. London.
Wald, Abraham. 1942. On the Principles of Statistical Inference. Notre
Dame, Ind.: Notre Dame.
Wald, Abraham. 1950. Statistical Decision Functions. N.Y.: Wiley.
Walker, Helen, and Joseph Lev. 1953. Statistical Inference. N.Y.: Holt.
Wallace, D. L. 1958. Intersection region confidence procedures with an
application to the location of the maximum in quadratic regression.
Annals of Mathematical Statistics, 29:455-475.
Zweifel, James R. 1966. Use of the likelihood principle for the determination
of carcinogenic activity in pulmonary tumor assays. Journal of the
National Cancer Institute, 36:937-946.
SOME HALF-BAKED THOUGHTS ABOUT
INDUCTION1
Max Black

0 “Induction” stands here for any kind of nondemonstrative argument
whose conclusion is not intended to follow from the premises by sheer
logical necessity.
0.1 The negation of an inductive conclusion is compatible with the
amalgamated premise (the conjunction of all the reasons offered in support
of the conclusion). As Peirce said, induction, in this broad sense, is
“ampliative”—“the facts summed up in the conclusion are not among those
stated in the premises” (Papers, 2.680).
0.2 It would be useful to have some short label for “nondemonstrative
argument,” to avoid confusion with generalization from particulars. My
own candidate is “adduction,” which contrasts neatly with “deduction.”
0.3 Philosophers need to worry about adduction in general, and not
especially about that primitive variety of it known as induction by simple
enumeration.

1 There is such a thing as induction.


1.01 Induction, pace Popper, is not an illusion, a chimera, or a will-of-the-wisp.
Nor is it an invention of simpleminded philosophers of science.
1.02 Laymen, and scientists too, constantly offer as sufficient reasons for
empirical conclusions the truth of propositions that do not entail those conclusions.
In so doing, they think they are reasoning soundly. It is inconceivable
that they should always, and necessarily, be mistaken.
1.03 Once in a while, we can prove an empirical conclusion by invoking
empirical laws and observations. But in the absence of proof, we are not
reduced to twiddling our thumbs. (The Chinese speak of riding an ox to
find a horse. But the ox moves, too.)
1.1 Induction is an instrumental notion: Inductions must be good for
something. If there is such a thing as induction, we must achieve something
by sound inductive reasoning. But what? Why not: arriving, if all
goes well, at new and true propositions, by means of a defensible procedure?
(But how to defend the procedure? There’s the rub.)
1.11 An inductive reasoner is like an archer: He wants to aim well, but
only in order to hit the target. There is no point in aiming well, if it does not
help you to hit the mark.
1.2 We ought to resist the persistent inclination of philosophers of induction
to treat all inductions as deductive enthymemes, and not merely
because this leads to wholesale skepticism.
1.21 The Drang nach Beweis can be made immensely plausible, for
there is usually relevant background information that can be introduced
to “strengthen” the argument. (You think there is a man in that car because
you see it moving at high speed? But you would not think so unless
you knew a great deal about cars and men. So why not write it down?)
1.211 Does the notion of a completely explicit argument always make
sense?
1.3 When inductions are treated as partially inexplicit deductions, they
must all be rejected as invalid, epistemically circular, or irrelevant.
1.31 An induction, the conjunction of whose premises is P and conclusion
K, can easily be rendered valid by supplying the additional premise, If P
then K. But when P expresses the reasoner’s complete reasons for affirming
K (so that the original induction is fully explicit), no further reason is
available, at the time at least, in support of the additional premise, If P
then K. Reconstructed as a deductive argument, it has the defect of using
a premise that its author has no good reason to think true.
1.311 In fact, the additional premise may very well be false.
1.312 If the additional premise is so weak that If P then K does not
follow from it, in conjunction with P, the reconstructed argument, considered
as a deduction, must be invalid.
1.313 If If P then K is offered as a reason for passing, inductively, from
P to K, the circularity is patent.
1.314 But there is nothing to stop us from looking for additional information
which might allow If P then K, or something stronger, to be
established.
1.315 The foregoing strictures will apply, with even greater force, to any
additional premise that is stronger than If P then K.
1.3151 The famous principles of Uniformity of Nature, Limitation of
Variety, and the like are too strong to be established and too weak to imply
the conditional link, If P then K.
1.31511 Furthermore, the chances are that one and all of them are false.
1.32 Those who cannot conceive induction to be legitimate sometimes try
to weaken the conclusion by inserting a reference to probability.
1.321 On certain “logical” theories of probability (e.g., Carnap’s) a
weakened conclusion, K′ say, does follow strictly from P. But then such an
argument fails to make the “inductive leap” and is therefore irrelevant to
the purpose of induction.
1.3211 What is the use of telling me something that follows strictly
from a proposition expressing my empirical evidence when I want to know
something about what is not entailed by that proposition—something about
the not-as-yet-verified? It would be more straightforward—more candid—
to say that my purpose was misguided and impossible of achievement.
(This way lies inductive skepticism.)
1.4 According to the hidden-assumption-approach, inductive inference
is good for nothing because it cannot serve the imputed purpose of being a
partially explicit but sound deductive argument. According to the probability-seeking-approach,
inductive inference has not been shown good for
anything because all the reputable work is accomplished in the calculation
of the probability. One approach makes induction useless, the other makes
it otiose.

2 Some inductive arguments are better than others and some are very
good indeed.
2.1 Anybody who rejects all inductive arguments, indiscriminately, as
“invalid” seems committed to holding that if I hear the sound “ku-ku”
from a tree, it is no more reasonable to think there is a cuckoo in the tree
than to think there is a lion there; and no more reasonable to expect to
find a pebble on the sidewalk than to expect to find a thousand dollar bill.
2.11 If we hold P to be true and are required to choose between K
and not-K, it is sometimes reasonable to say that there is nothing to choose
between them; but sometimes this reply would be the height of absurdity.
(Take P to be: all but one of the 1000 marbles in the bag are black and
this marble was drawn at random from the bag; and K to be: this marble is black.)
2.2 People whose common sense has not been weakened by philosophy
will often agree in their judgments about the goodness or badness of given
inductive arguments. This provides some hope for the codification of inductive
procedures. But there is enough disagreement to make us cautious.
2.3 If we ever met somebody who seemed to agree sincerely with that
old bogey the “counterinductionist,” we would have to treat him as a
lunatic. And we should be right.
2.31 But how would such a lunatic behave? Can one commit the
“gambler’s fallacy” all the time?

3 There is no universal criterion for the soundness of an induction.


3.1 One might try as a criterion: The truth of the premises must be a
good and sufficient reason for thinking the conclusion to be true. But this is
circular in its use of “good”; and in any case, the application of such a
criterion, if it deserves that name, would depend on recognizing when
the reasons are “sufficient.”
3.2 The question as to when inductive premises are sufficiently strong
might be compared to the question as to when the foundations of a house
are sufficiently strong. There are no infallible answers to either question—
but it is not a matter of mere guesswork either.
3.3 It is worth remembering that there is no universal criterion for
the validity of a deductive argument. Logicians comfort themselves by talk
about the “truth-preserving” feature of a valid deductive argument; but
upon reflection they are seen to be saying no more than that the conclusion
follows from the premises.

4 There are some formal principles that are relevant in appraising inductions.
4.1 That all the observed A’s have been B is some reason for thinking
that any given A is a B; that most A’s are B is some reason for thinking
that an A drawn at random is a B; if P inductively supports K, and Q inductively
supports K, and P and Q are compatible and independent, the
argument from P·Q to K is at least as good as the argument from P to K;
and so on.
4.2 Principles such as these I hold to be “linguistically a priori,” i.e.,
to be guaranteed by what we properly mean by “reason,” “support,” etc.
4.3 Such principles provide a meager harvest and serve rather to
stigmatize as bad the inductive arguments that no sensible man would ever
use than to provide guidelines for genuine inductive inferences.

5 Inductions yield guarded assertions.


5.01 By a “guarded assertion” I mean a proposition to which is attached
some indication of the strength of support for that proposition in the light
of the available evidence.
5.02 Assertions can be explicitly “guarded” by the use of expressions
such as “almost certainly,” “very likely,” etc. which I call “confidence
indicators.” (When the risk is negligible, the indicator may be omitted.)
5.1 All inductions are precarious, in a way in which valid deductions
are not, but not precarious in the way that a rickety footbridge might be.
It is obviously useful to the hearer to know the speaker’s estimate of the
risk—or, what comes to the same thing, his estimate of the strength with
which the premises support the conclusion.
5.2 If the confidence indicator is omitted, there is a danger of fallacious
inference: Thus P may “sufficiently” support K₁ and sufficiently support K₂
without sufficiently supporting the conjunction K₁.K₂.
5.21 There should be no particular difficulty in constructing, for what it
is worth, a calculus of “guarded assertions.”
5.3 A guarded assertion is still an assertion. I take the conclusion of an
inductive inference to be “detached”—but with a cautionary signal
attached. Caveat auditor: He has been warned.
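The danger named in thesis 5.2 can be illustrated numerically. The sketch below rests on assumptions that are not in the text: strength of support is modeled as a probability conditional on P, "sufficient" means reaching a fixed threshold, and K1 and K2 are treated as independent given P. The threshold value and all function names are hypothetical.

```python
# Assumption (not in the text): "support" is modeled as a probability
# given the premise P, and "sufficient" means >= THRESHOLD.
THRESHOLD = 0.9


def sufficiently_supported(prob, threshold=THRESHOLD):
    """Return True when the modeled support reaches the threshold."""
    return prob >= threshold


def conjunction_if_independent(p1, p2):
    """Support for K1.K2 when K1 and K2 are independent given P."""
    return p1 * p2


def conjunction_lower_bound(p1, p2):
    """Worst case without any independence assumption (Frechet bound)."""
    return max(0.0, p1 + p2 - 1.0)


support_k1 = 0.92
support_k2 = 0.92
support_joint = conjunction_if_independent(support_k1, support_k2)

print(sufficiently_supported(support_k1))     # each conjunct passes
print(sufficiently_supported(support_k2))
print(sufficiently_supported(support_joint))  # the conjunction does not
```

Carrying the numerical indicator along with each conclusion, rather than detaching a bare "K1" and "K2", is what blocks the fallacious step in this toy model.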

6 Induction is an art, not a science or a system of mechanical routines.


6.1 The art of drawing risky inferences might be compared to the art
of mountain climbing. The climber clings to his holds, estimates the
security of the next footholds and handholds—and leaps. So also for
induction.
6.2 There are useful inductive strategies and maxims but no infallible
rules for getting right conclusions.
6.21 There are no infallible principles for climbing mountains in general.
But when the task is narrowed to climbing a known mountain, it is
possible to rely on determinate and helpful strategies. So also in induction:
The better defined the inductive task, the more help we can get from
previous experience of similar tasks and the obstacles to their fulfillment.
Cf. the relatively definite rules for statistical inferences, discussed in treatises
of statistical method.
6.3 Good inductive inferences call for judgment—in the sense in which
one speaks of a good judge of wine or of livestock.
6.31 A miniature inductive problem: waking up, noticing that my watch
has stopped and needing to decide whether it is time to get up yet. There
are intelligent as well as stupid ways of attacking this task.
6.32 In a good inductive solution, there are present at least the
following: recall and choice of relevant premises; estimation of the strength
with which they support the conclusion under examination; conflation of
several supporting strands to yield a judgment of the resultant strength
of the conclusion; a decision whether a suitable “detachment point” has
been reached, or whether suspension of belief is in order.
6.321 At point after point in this process, the reasoner’s flair, skill, good
judgment, are crucial.
6.4 All sound inductive judgment is made against a background of
detailed knowledge that activates the skill and expertise built up by repeated
exercises of inductive judgment.
6.41 Asking somebody to form an inductive judgment about a skeleton
argument, presented in all the nakedness of abstract symbols, is like asking
a connoisseur to evaluate an imaginary painting.
6.5 One way to find out whether a man is a good inductive judge is to
notice how often he goes wrong without extenuating circumstances. A
good inductive judge must not fail too often.
6.51 But how often is “too often”? It takes a good inductive judge to
judge an inductive judge—if only because the arguments in question have
to be independently appraised.
6.52 We learn to become better inductive judges by scrutinizing our
failures and retroactively criticizing the trains of thought that produced
them.

7 Induction needs no general philosophical justification—and can receive none.
7.1 Again, it is worth insisting that the same is true of deduction.
Addicts of deduction should not harbor flattering illusions.
7.2 An inductive inference purports to justify the assertion of its
conclusion. So a demand for justification of induction as such is as odd as a
demand for an explanation of explanation as such or a proof of proof as
such.
7.3 Specific challenges (demands for justification) of specific inductions
are often in order: The critic may question the imputed strength of the
reasons offered, their cumulative force, the appropriateness of the chosen
“detachment point,” and so on. Sometimes, such challenges are properly
met by further inductive arguments.
7.31 But a point is eventually reached where the best that can be said
is “That’s how it looks to me.” There is no substitute in the end for the
reasoner’s judgment.
7.32 If anything is clear in this whole subject, it is that no defense—
deductive, inductive, or “pragmatic”—will satisfy the resolute inductive
skeptic.
7.321 Offering the skeptic an inductive “justification” of induction is as
futile as telling a dipsomaniac that the snakes he sees are not poisonous.
(The psychotic who knows that two and two make four, but can’t stop
worrying about it.)
7.4 The only way to cope with a general challenge to induction is to
show that no philosophical task has been defined by the challenger. (“If
only the fool would persist in his folly he would become wise”—Blake.)
How could one “justify” mountain-climbing in general? There is nothing
to justify—and the same applies to induction.
7.41 Isn’t the trick in philosophy, sometimes, in knowing when to stop
worrying?

NOTE
1. Offered as a birthday card in affectionate tribute to Ernest Nagel, in the hope
that he might agree with at least half of what is said. This summary statement of a
philosophy of induction may conceivably help others to see precisely where and why
they disagree.
2. Structure of Science
ON WHAT THERE MAY BE IN THE WORLD
G. Feinberg

I. INTRODUCTION

The notion that philosophy stands behind science in an advisory role
is an old one, although it has fallen into disrepute among scientists and
philosophers in the twentieth century. One of the roots of the notion is the
view that in scientific thought, the mind not only acts to frame concepts
relating the raw sense data, but also has a prior role in determining which
concepts are considered admissible at all. Among twentieth century
scientists, Einstein was perhaps the most forceful spokesman for this view,
for instance in his statement that “. . . the axiomatic basis of theoretical
physics cannot be extracted from experience, but must be freely invented.”
I do not think this view is generally applicable, because many times in the
history of science, the observations have led scientists to accept theoretical
systems which have seemed rather unsatisfying to their inventors. The
history of the development of the quantum theory contains several examples
of this. At such times, an effort to impose a priori requirements as to what
concepts were admissible would almost certainly have hampered the
understanding of the phenomena, and the scientists of the time were wise to let
the experiments lead the way.
There are other times when I believe it is useful to let one’s imagination
go beyond a consideration of the phenomena being actively studied, and to
try to see what sorts of things might in principle exist in the world. This is
particularly the case when there is a wealth of experimental material in some
area being actively investigated, since that is just the circumstance in which
other phenomena may remain undiscovered, or else be ignored. The former
seems to have been the case with natural radioactivity, which could have
been discovered 50 years earlier, had it been imaginable. The latter
happened to the first discovery of parity non-conservation in β-decay, which
was reported in 1929, or more than 25 years before it was predicted and
rediscovered.
In my opinion, we may be in such a situation now in physics, but in a
deeper sense. In the past 30 years, physics research has concentrated on
the discovery of elementary particles, and the examination of their
properties. Other aspects of nature, with the possible exception of cosmology,
have been regarded as derivative, and ultimately to be explained in terms
of the particles. This program has been quite successful in understanding
the known physical phenomena. This very success has perhaps acted as an
obstacle to a search for other objects in the physical world, not composed
of particles. It is easy to see how this could happen. Scientists today
tend to think of the results of their measurements in terms of an elaborate
theoretical model. Such a model will imply that certain aspects of the
observation are essential, while others are trivial or irrelevant. With such a
bias, it is easy to miss some phenomenon which emphasizes a different
subset of aspects. For instance, one might imagine some effect which
caused the mass of an atom to vary according to the phase of the moon.
An effect of this kind is likely not only to be overlooked if it occurred, but
even to be dismissed out of hand.
It cannot be expected that working scientists will spend much of their
time pursuing wills-o’-the-wisp. Rather, one should draw the conclusion
that there may be things in nature independent of particles that have been
missed because no one has looked for them. If so, it would seem to be of
value to imagine what such things might be, and perhaps more important, to
determine how to look for them. Imagining what might be has been a part
of what some call philosophy, and it is in this sense that I believe that
philosophy may again influence the development of science.
In this article, I shall sketch a few possible physical entities or
phenomena that I think could exist without having been discovered until
now. I shall start by indicating the theoretical presuppositions that I will
make, since the possibilities are too numerous without such presuppositions.
I shall then discuss a few of the possibilities in more detail, indicating how
one might look for them.

II. THE RELATION BETWEEN WHAT WE KNOW AND WHAT MAY BE

One of the most remarkable differences between Newtonian physics and
quantum physics is that the behavior of a given quantum mechanical system
is influenced not only by its constituents and its surroundings, but also by
all other physical systems that can exist. This influence is a consequence
of the possibility of spontaneous creation and destruction of new objects,
which is perhaps the most important consequence of the theories achieved
in the late 1920’s which unite quantum mechanics and special relativity.
We do not yet know all of the physical laws that determine which physical
systems occur in nature, that is, can be observed under some circumstance.
Nevertheless, it is a prediction of any relativistic quantum theory that if
some system does exist in this sense, then there is a finite probability of
spontaneously creating the system, provided only that sufficient energy is
available.
Even when a physical system does not contain enough energy to
spontaneously create other objects which will persist for an indefinite time, it
can still do so for a very short time, because of the uncertainty relation
between time and energy, which allows for violations of energy conservation
over short periods of time. Hence any physical system will again and again
spend short periods of time as another physical system, containing other
objects, which would not be found in the first system by measurements which
last over time periods commonly employed in the laboratory (by this I mean
times greater than 10⁻¹⁹ seconds). These spontaneous creations and
destructions of new objects, known as virtual transitions, play a major role in
determining the properties of most of the elementary particles. In this case
the new objects created and destroyed are usually assumed to be other
elementary particles. The possibility of virtual transitions seems to be much less
important for determining the properties of collections of particles such as
nuclei, atoms, or macroscopic objects, although even there in some cases
precise measurements can detect their effects.
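The duration of a virtual transition can be estimated roughly from the time-energy uncertainty relation. The sketch below uses the common order-of-magnitude heuristic Δt ≈ ħ/ΔE, which is an assumption of this illustration and not a formula given in the text; the choice of an electron-positron pair as the example and the function name are likewise illustrative.

```python
# Heuristic only (an assumption of this sketch): a state that "borrows"
# an energy delta_E can persist for roughly delta_t ~ hbar / delta_E.
HBAR_EV_S = 6.582119569e-16          # reduced Planck constant, eV * s
ELECTRON_REST_ENERGY_EV = 0.51099895e6  # electron rest energy, eV


def virtual_lifetime_seconds(energy_deficit_ev):
    """Order-of-magnitude lifetime of a virtual state, in seconds."""
    if energy_deficit_ev <= 0:
        raise ValueError("energy deficit must be positive")
    return HBAR_EV_S / energy_deficit_ev


# Illustrative case: creating a virtual electron-positron pair costs
# at least twice the electron rest energy.
pair_deficit_ev = 2 * ELECTRON_REST_ENERGY_EV
pair_lifetime_s = virtual_lifetime_seconds(pair_deficit_ev)

print(f"energy deficit: {pair_deficit_ev:.4e} eV")
print(f"estimated lifetime: {pair_lifetime_s:.2e} s")
```

The estimate comes out many orders of magnitude below any time interval resolved directly in the laboratory, which is why such transitions show up only through their indirect effects on measured properties.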
If other objects than particles and their aggregates exist and obey the laws
of quantum mechanics and relativity, then they also will have some effects
on the properties of things we already know, because of the possibility of
virtual transitions. This indirect effect on our present observations places
some severe restrictions on the properties of hypothetical new objects. In
particular, the conservation laws of energy, momentum, angular momentum,
and charge would have to be satisfied for these new objects, or else, through
the virtual transitions, they would be violated for particles as well, in
contradiction to experiment. On the other hand, we shall see that the existence
of certain hypothetical objects would enable the preparation of states of
elementary particles which although allowed by relativistic quantum
mechanics are not preparable using particles alone.
The possibility of other objects than particles is contained in relativistic
quantum theory because neither relativity nor quantum mechanics prescribes
what objects do exist. Both of these theories do place restrictions on the
properties of objects, and both tell about how to describe the behavior of
objects that are known to exist, but neither requires that any of the possible
objects that they describe are indeed to be found. It is not clear how to
formulate a physical theory that does prescribe exactly what objects exist,
without ad hoc assumptions. This is a problem in which a careful analysis of
language might play an important role.
It is of course clear that since we and our measuring instruments are
constructed of particles, any other objects that may be found in the world can
only show themselves to us through their interaction with particles.
Therefore, in these remarks, I shall concentrate on this interaction, and describe
what novelties might occur in the behavior of particles as a result of
interactions with things that are not particles. It is by no means clear that all
aspects of our hypothetical entities can be probed in this way, or that there
is any unique correspondence between the new entities and the anomalous
behavior of the particles, or measuring instruments made of them. However,
that is all we have to go by, and it necessarily must suffice. This is not the
only respect in which we view the universe as through a glass darkly.

III. FIELDS WITHOUT PARTICLES

A long line of physicists, from Faraday to Einstein, have found the field
an intellectually satisfying concept for describing physical phenomena. By
a field is meant here an extended region of spacetime in which some
measurable disturbance is to be found. In the contemporary formulation of
particle physics, this notion does not play a central role. This can be said in
spite of the fact that the formalism used by most physicists to describe
elementary particles is called quantum field theory. The point is that while
this formalism deals with operators which depend on spacetime, as classical
fields do, these operators may be thought of as descriptive conveniences
used to formulate certain assumptions about elementary particles. Indeed,
some progress has been made, under the name of the S-matrix theory, in
purging the notion of fields from elementary particle physics altogether.
Whether or not this can be done, it is clear that the fields that are now
considered are not independent of the particles that they are used to describe.
It is therefore of interest to raise again the possibility that such
independent fields may exist in nature. The meaning of this suggestion is the
following. We imagine a region of spacetime in which energy and
momentum occur, and can be exchanged with our measuring instruments.
However, this concentration of energy and momentum, and perhaps other
physical quantities, should not be simply a superposition of energy,
momentum, etc. of distinct particles. The latter would for example be the case for
an electromagnetic field, which can be thought of as a superposition of
various numbers of photons, or light quanta. I shall refer to the independent
fields as Faraday fields. A non-quantized field would be a Faraday field, but
we have learned from Bohr and Heisenberg that the inner logic of the
quantum theory requires that anything interacting with quantized matter must
itself have quantum properties, so that purely classical fields cannot exist.
In particular, the gravitational field must be quantized, and so does not
necessarily qualify as a Faraday field, although it may have one such aspect.
Let us next consider what properties the Faraday fields should have, and
how to go about detecting them. I shall assume that they satisfy the
principles of relativistic quantum mechanics. I do this here to avoid the
multiplication of new hypotheses. In a later section, I shall try to describe
hypothetical objects not obeying these laws, and how to detect them. I also
assume, for reasons that I have previously indicated, that the Faraday fields
satisfy the conservation laws that hold for the elementary particles.
The principles of relativistic quantum mechanics guarantee that energy
and momentum will be associated with Faraday fields. However, this energy
and momentum will not be a precisely defined quantity. Nor in general need
the observables associated with particles, such as charge, have a well-defined
value for the fields. This in itself would be a new situation, since all known
states containing a definite number of elementary particles always have
definite charge. States with indefinite charge may be used as reference states
for the preparation of elementary particle states having well-defined values
for observables that do not commute with charge. This would be analogous
to the use of objects with well-defined positions to produce states that are
superpositions of distinct momentum states. If states of elementary particles
which are not eigenstates of charge can be prepared, it would necessitate
the re-examination of the so-called superselection rules1 that play a
prominent role in the theory of symmetries in particle physics.
I have remarked that if the fields are to affect our measuring instruments,
which are made of particles, then there must be a specific interaction
between the fields and particles, which will transfer momentum, energy and
other physical quantities to and from the particles, i.e., scatter them. Let us
ask what properties of field-particle scattering would distinguish it
from, say, particle-particle scattering.
One such characteristic comes from the fact that the field will not have
definite angular momentum. As a result, the scattering of the particle by
the field will in general depend on all the individual components of the
particle’s momentum before and after its collision with the field. This is to be
contrasted with the case of particle-particle scattering, where only certain
definite scalar functions of the momenta occur. The reason for the difference
is that the field will in general not be isotropic, and so define directions in
space, and the scattering will in general depend on the angles between the
particle momenta and these directions.
The indefiniteness of the fields’ linear momentum and energy does not seem
to have any simple consequences for the identification of the fields, since
these quantities are in any case continuously variable for particles. However,
the indefiniteness of electric charge may also be a useful tool for
distinguishing a field from a set of particles. For example, by measuring
the electric flux through the surface of a volume, one can determine the
average value of the total charge in the region. For a set of particles with
definite charges, this will always be an integral multiple of the unit charge.
For a field without definite charge, this will not in general be the case.
However, very accurate measurements of the flux would be required to determine
this.
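The flux argument can be written out as a short worked equation. What follows is a sketch under stated assumptions: SI units, Gauss's law for the flux through a closed surface S, and an illustrative two-component superposition of charge eigenstates |n⟩ with amplitudes c_n. The particular state is hypothetical and chosen only to show a non-integral average.

```latex
% Gauss's law relates the measured flux to the enclosed charge:
\Phi_E \;=\; \oint_S \mathbf{E}\cdot d\mathbf{A} \;=\; \frac{Q}{\varepsilon_0}

% For a set of particles with definite charges, Q is an integral
% multiple of the unit charge e:
Q = n e, \qquad n \in \mathbb{Z}

% For a state without definite charge (illustrative assumption),
% |\psi\rangle = \sum_n c_n |n\rangle with \sum_n |c_n|^2 = 1,
% the average charge is
\langle Q \rangle \;=\; e \sum_n n\,|c_n|^2

% Example: c_0 = c_1 = 1/\sqrt{2} gives
\langle Q \rangle \;=\; e\left(0\cdot\tfrac12 + 1\cdot\tfrac12\right) \;=\; \tfrac{e}{2}
```

Distinguishing such an average from the nearest integral multiple of e requires resolving the flux to a small fraction of e/ε₀, which is the accuracy requirement the text alludes to.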
If Faraday fields are identified, it will become an interesting question to
what extent they are mathematically independent of the fields used to
describe particles. One could imagine that the same mathematical field
theory could possess some solutions which behave like Faraday fields and
others which behave like the fields associated with particles. Alternatively,
it could be that if Faraday fields exist they are described by mathematically
distinct objects. Presumably the former circumstance would manifest itself
as a particular kind of interaction between the Faraday field and the particles
associated with the same mathematical field theory. One source of such an
interaction would be the quantum conditions imposed on the mathematical
fields, which would produce a kind of correlation between the particle wave
function and the field strength in a given region. The precise form of the
correlation remains to be worked out.
Such a mathematical relation between the discrete objects we call particles
and the continuous Faraday field would be an example of one entity
manifesting itself in two apparently distinct forms, somewhat as Venus shows
itself sometimes as the morning star and sometimes as the evening star. We
can imagine a more dependent relation between particles and an underlying
continuous substratum. This possibility will be discussed in the next section.

IV. A POSSIBLE SUBSTRATUM FOR ELEMENTARY PARTICLES

The complementary notions of discrete particles and a continuous fluid
have shared the imaginations of physicists throughout history, with one or
the other dominating the views of an individual physicist about the ultimate
structure of matter. In the nineteenth and twentieth centuries, the question
was apparently settled permanently by the atomic theory and the discovery
of the elementary particles. Again Einstein was the leading opponent of this
conclusion among twentieth century physicists, with his view that particles
were some kind of singularity in the spacetime structure of general relativity,
or roughly speaking, were regions in which spacetime was strongly distorted.
This view did not seem to have much to recommend it in the early 1930’s
when it appeared possible to understand the world in terms of a few
relatively stable particles, i.e., neutrons, protons, and electrons. However, in
view of the large number of particles discovered since then, and the ease
with which they are created, destroyed, and transformed into one another,
the case for the conclusion that particles are the ultimate constituents of
matter seems to me to have been weakened. This is not to say that bulk
matter is not well understood in terms of the particles we know. Rather, it is
the recognition of a psychological discomfort with the idea that those things
which cannot be analyzed any further should have properties as complicated
as the particles do.
It is possible that this situation will be remedied within particle physics,
if some of the speculations involving “quarks” turn out to be valid.
According to these ideas, most or all of the observed particles are composed of a
small number of other particles, the “quarks,” as yet unobserved, with
somewhat simpler properties. This is somewhat like what the situation would be
if we knew only heavy atoms, and had not discovered electrons, protons,
and neutrons. In my opinion, this alternative is likely to be realized only
together with major revisions of relativistic quantum mechanics which we
do not yet know how to carry out. In particular, the quark hypothesis
appears difficult to harmonize with the possibility of virtual creation and
annihilation of particles which is characteristic of relativistic quantum theories.
If a simple understanding of the behavior of the particles is not to be
found within particle physics itself, then perhaps it can be obtained by
giving up the idea that particles are the ultimate structures in nature, and
explaining them as being composed of some underlying, non-particulate
stuff. Some of the phenomena seen in particle physics are suggestive of this
possibility. Although the particles change their number and kind easily,
certain quantities like energy and charge remain unchanged in these
transformations. Other quantities, like the hypercharge, do change, but so rarely
that one can neglect their variation in many cases. These phenomena have
their analogues in the behavior of excitations of a fluid. There also, some
quantities, such as the mass and the vorticity, are exactly conserved,
even though the individual excitations may be created and destroyed. The
excitations in a classical fluid do not show any of the discreteness of
particles, such as definite mass, charge, and angular momentum. However,
one can expect that this difficulty would not occur for a quantized fluid,
since making a quantity into an operator often makes its spectrum discrete.
It is not the purpose of this short section to propose a detailed theory in
which particles are described as excitations of a continuum. Rather, I would
again like to try to give some meaning to this idea in terms of what
phenomena it suggests could be observed. I shall refer to the underlying
continuum as the fluid, even though this language can only be figurative.
To say that particles are excitations of an underlying fluid would be a
more empirical statement if one could observe the fluid in an unexcited state.
If this could be done, one might think that we could trace the “motion” of
the fluid composing the particle into and out from the background fluid when
the particle went through some changes. Of course, if the fluid obeys
quantum mechanics, its motion can be followed only in an approximate way, and
this seems to make such a direct determination of a substratum for particles
difficult to carry out. Nevertheless, it is instructive to examine what
properties the unexcited fluid or substratum might have.
We assume that the substratum can carry energy, momentum, charge and
the other conserved attributes of particles. However, when the fluid is in an
unexcited state, these will not necessarily be observable. The point is that
energy, for instance, is defined relative to a reference state, which is taken
to have zero energy. If the fluid is uniformly distributed in space, then it
would be natural to take it as having zero energy in its unexcited state. On
the other hand, if the fluid is only found in parts of space, or if its density
varies from point to point, then its energy cannot be uniformly zero. In the
latter case, it would be possible for the unexcited fluid to exchange energy,
or other quantities with particles, and one could observe fluctuations of the
values of these quantities measured for a particle as it passed through a
region free of other particles. It would be of interest to know experimentally
whether such fluctuations do occur.
A new possibility arises in connection with quantities such as the
hypercharge, an attribute of some particles that is additive like the charge but is
not exactly conserved for the particles. One could imagine that hypercharge
is not conserved when considered as a property of the fluid either, but rather
can change slowly. Alternatively, it could be that hypercharge is a
conserved property of the fluid, but that there exist excitations of the fluid
which contain hypercharge without containing the other exactly conserved
attributes of particles. In the former case, one would expect that the values
of hypercharge for the fluid would not be restricted to integral values as for
particles. This is under the assumption that the hypercharge of the entire
fluid changes continuously from its initial to its final value. The possibility
of detecting such non-integral hypercharges must await some method of
measuring hypercharge other than counting, which has not yet been devised.
Such a method is available for electric charge, since this acts as the source
of the electromagnetic field. If non-integral values of electric charge can
also occur in the substratum, then these might be detected by measuring
their electric flux, as described in the discussions of Faraday fields.
The alternative possibility that hypercharge is exactly conserved and that
there exist states carrying hypercharge, but not energy or momentum, has
been proposed within the context of particle physics by T. D. Lee.2 Again, a
direct confirmation of their existence must await a new method of measuring
hypercharge. One could hope to detect them indirectly by seeing them
stimulate transitions of particles in which the hypercharge changes. The rate at
which such transitions take place would be proportional to the density of
the hypercharge-carrying excitations of the fluid. If these are indeed
momentumless, the density should be constant over very large regions of space,
and so should the rate of transitions in which a particle’s hypercharge
changes, in agreement with experience. This rate will then contain two terms,
one coming from spontaneous emission of these excitations, the other from
induced emission and absorption of them. The spontaneous rate probably
cannot be modified, but it seems reasonable that we could modify the
induced terms by modifying the density of the excitations. This might be done
through a sort of shielding by other particles. If this were possible, it would
show up as a variation of the rate of hypercharge non-conserving transitions
of particles with environment. It would be of interest to make a systematic
search for such effects.
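The two-term structure of the transition rate can be put in a simple schematic form. This is only a sketch: the symbols A (spontaneous term), B (coefficient of the induced term), ρ (density of hypercharge-carrying excitations), and s (fractional shielding) are introduced here for illustration and do not appear in the text, and the linear dependence on ρ is assumed by analogy with induced emission and absorption of radiation.

```latex
% Rate of hypercharge-changing transitions: a spontaneous term plus an
% induced term proportional to the density of excitations
\Gamma(\rho) \;=\; A \;+\; B\,\rho

% Suppose shielding lowers the local density by a fraction s, 0 \le s \le 1:
\rho' \;=\; (1 - s)\,\rho

% The fractional change in the observed rate is then
\frac{\Gamma(\rho') - \Gamma(\rho)}{\Gamma(\rho)} \;=\; -\,\frac{s\,B\,\rho}{A + B\,\rho}
```

On this schematic picture, an environmental dependence of the rate is observable only to the extent that the induced term Bρ is not negligible beside the spontaneous term A, which is one way of stating what a systematic search would have to measure.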
If any of the phenomena we have described are discovered, it would not
demonstrate that particles are composed of a continuous fluid, but perhaps
only that such a fluid does exist. It is hard to avoid the conclusion that the
compositeness of particles would at first have to be decided theoretically.
That is, after determining that such a fluid exists and learning about some
of its properties through direct experiment, one would try to calculate some
of the known properties of particles by treating them mathematically as
excitations of the fluid. This would probably suggest other properties of the
fluid that could also be measured, as well perhaps as new phenomena
involving particles. By such a procedure one might gradually become
convinced that particles were excitations of an underlying fluid. Our knowledge
of atoms and the hypothesis that all bulk matter is made of them were also
arrived at by a combination of “as if” reasoning with direct measurements,
and it is not impossible that such a process should occur again on a new
level.
One may also ask what manifestations a fluid substratum might have
other than particles. It is attractive to speculate that some of the properties
we attribute to spacetime might be understood in terms of such a fluid.
General Relativity is one step in this direction, since we can consider the
gravitational field as an example of a fluid which determines the metric properties
of spacetime to some extent. Our spacetime has other properties, such as a
definite topology and dimensionality, which are usually taken for granted.
At some point physics will have to deal with the question of whether these
properties are accidental or contingent, and such a question would seem
more natural if there is a connection between spacetime and the fluid
substratum. However, the nature of this connection is not something that can be
easily imagined at present.

V. OBJECTS THAT OBEY NEW PHYSICAL LAWS

The objects that we have considered thus far were still supposed to satisfy
quantum mechanics and relativity, although differing in some qualities from
the objects we know, i.e., particles. In this section, we consider the
hypothesis that objects exist which satisfy other laws, although still interacting with
particles. I shall for the most part concentrate on revisions of quantum
mechanics, which I consider the more fundamental set of laws. Since particles
satisfy quantum mechanics and relativity rather accurately, the hypothesis
is tenable only under one of two alternatives. The first is that the violations
of quantum mechanics and relativity by the new objects lead to effects
involving ordinary particles that are too small to have been observed thus far,
but which might be observed through better experiments. The analysis of
this possibility is fairly straightforward once the violations are specified.
The other alternative, which is somewhat more intriguing, is that particles
may satisfy the known laws of physics exactly, although interacting with
new objects which do not precisely satisfy these laws. It has been known [3]
since the pioneering analyses of Heisenberg, and of Bohr and Rosenfeld,
that some restrictions must exist on the properties of any objects
interacting with another object, such as an electron, which precisely obeys the
laws of quantum mechanics. For example, it is known that if the
electromagnetic field were a classical field interacting with electrons, then the
electrons could not obey quantum mechanics exactly, because it would be
possible to make measurements on them, using electromagnetic fields, that
would violate the uncertainty principle. However, it is not known how
strong these restrictions are in general, and in particular whether they imply
that all of a set of interacting objects obey quantum mechanics if any of
them do. If this is not the case, then it would not be so surprising if
nonquantum systems are found in nature.
It should also be remembered that experiments have by no means tested
all aspects of quantum mechanics for particles. In most cases, there is no
known experimental procedure for preparing a state which is the
superposition of two given states, or for measuring a quantity which is some
function of measurable quantities. Therefore, it is conceivable that quantum
mechanics could be replaced by some weaker theory, without losing
agreement with what we know from experiment. No systematic study of this
question has been carried out to my knowledge, which is surprising in view of
the reputed desire of the founders of quantum mechanics to be guided by
what is experimentally feasible. I shall not, however, consider this possibility
any further here.
In considering what new laws the hypothetical objects might obey, we are
again faced with several alternatives. One would be that these laws are more
stringent than those of relativity and quantum mechanics, in that they give
definite values for quantities that are allowed to vary in those theories. For
example, one could imagine some physical system for which the momentum
could take on only certain values, rather than arbitrary values as in current
theories. This general possibility might be referred to as hyperquantization.
It is clear that if a system A, in which some variable V is hyperquantized,
interacts with another system B, in which the variable V is not
hyperquantized, the values of V for the system B can still only change by definite
amounts, equal to the difference of two of the allowed values in A, provided
that V is conserved. This was the argument used by Franck and Hertz to
demonstrate energy quantization in atoms. It is suggestive to try to explain
some of the approximate conservation laws mentioned in section IV in these
terms.
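The conservation argument above can be written out in one line. As a sketch, assume (in notation introduced here, not in the original) that the hyperquantized variable of A takes only the discrete values v_1, v_2, ..., and that the total of V over A and B is exactly conserved in the interaction:

```latex
V_A + V_B = \text{const}, \qquad V_A \in \{v_1, v_2, \dots\}
\;\Longrightarrow\;
\Delta V_B = -\Delta V_A = v_i - v_j \quad \text{for some } i, j .
```

Here v_i is the value of V in A before the interaction and v_j the value after it. In the Franck and Hertz case, V is the energy, A is the atom, and B is the bombarding electron, whose energy loss equals the spacing between two atomic levels.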
A second alternative is the existence of new objects about which even less
information can be obtained than present theory allows. According to
quantum mechanics, there is a relation between the accuracy with which position
and momentum can be determined in one experiment, but either one can
be determined with arbitrary accuracy. We can imagine systems for which
the latter is not the case, but for which the momentum, for example, can
never be measured with more accuracy than some value Δp. It is
occasionally proposed, without much foundation, that such limitations may
already exist for measurements on ordinary systems. This would appear
more likely for new objects, whose properties are in any case to be
determined through inferences from the behavior of particles. It should be
possible to devise a mathematical theory to describe the situation pictured
here, of objects whose momentum can be measured with only limited
accuracy because of the way they interact with particles.
If such objects have a momentum-energy relation similar to that of
particles, i.e., have a well defined mass, then their energy will also be only
approximately measurable. In that case, if we preserve the quantum-
mechanical relation between the energy of a system and its time variation,
we are led to conclude that these objects will have “internal clocks,” i.e.,
will not behave similarly at different times. If these natural clocks do exist,
they might have some role in determining when processes such as particle
decays occur, which according to ordinary quantum mechanics occur at
random for an individual system.
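One way to make this chain of inferences explicit, as a rough sketch only: assume the usual relativistic relation between energy E, momentum p, and mass m, and let Δp be the limiting accuracy of a momentum measurement. The symbols ΔE and τ are introduced here for illustration, and the estimates are order-of-magnitude at best.

```latex
E^2 = p^2 c^2 + m^2 c^4
\;\Longrightarrow\;
\Delta E \approx \frac{\partial E}{\partial p}\,\Delta p = \frac{p c^2}{E}\,\Delta p ,
\qquad
\tau \sim \frac{\hbar}{\Delta E} .
```

On this reading, an object whose energy cannot be defined more sharply than ΔE cannot be in a strictly stationary state, so its behavior would be expected to vary on a time scale of order τ. That is one sense in which it could be said to carry an internal clock.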
The cases considered thus far involve modifications of some properties of
physical systems, without, however, changing the basic description of the
system, which in quantum mechanics is through a vector in a linear space.
At the core of this description is the notion that alternative possibilities are
to be added in amplitude, rather than in probability, in determining an
outcome. It has been proposed by some physicists that this rule may be valid
for alternatives at a given time, but not for the development of a system in
time. In these proposals, the Schroedinger equation would be replaced by
an equation non-linear in the wave function. There is some evidence against
this idea insofar as the behavior of particles and atomic systems is concerned.
Again, it seems to me that it is more likely to be a new kind of object that
does not follow this superposition principle of quantum mechanics. These
objects would display characteristic fluctuations in time of such quantities
as energy, according to their initial state. Fluctuations of this sort would
probably be detectable, provided that they occurred on a reasonably long
time scale, say greater than 10^-12 sec. This could be done by measuring the
energy transferred when the object is absorbed in some detector, as a
function of the time interval from its production. Similar methods have been used
to detect time fluctuations which can occur in ordinary quantum mechanics.
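For reference, the rule of adding alternatives in amplitude rather than in probability can be stated compactly. In the sketch below, a_1 and a_2 are the complex amplitudes for two alternative routes to the same outcome; the symbols are introduced here and do not appear in the original.

```latex
P_{\text{amplitude}} = |a_1 + a_2|^2 = |a_1|^2 + |a_2|^2 + 2\,\mathrm{Re}\,(a_1^{*} a_2),
\qquad
P_{\text{probability}} = |a_1|^2 + |a_2|^2 .
```

The interference term 2 Re(a_1* a_2) is what distinguishes the two rules. A linear Schroedinger equation carries this rule forward in time: if ψ_1 and ψ_2 are solutions, so is any combination c_1 ψ_1 + c_2 ψ_2. An equation non-linear in the wave function would no longer guarantee this.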
It is not clear that objects obeying a non-linear equation of this type can
interact with particles without introducing a similar non-linearity into the
equations of the particles. This could even happen when the non-linear
objects were not physically present, through virtual transitions of the sort
discussed previously. This would conflict with the evidence that particles
do obey linear equations. It therefore is likely that if a theory can be
constructed in which a linear Schroedinger equation is maintained for the
particles, this will entail giving up the possibility of spontaneous creation
and annihilation of the non-linear objects. Although this possibility is
contained in all of the relativistic quantum theories we know, it is not known
whether it would also persist under specific modifications of these theories.
For that reason, one cannot disprove the existence of the non-linear objects
by the arguments indicated. Nevertheless, one would like to know how well
the assumption of a linear Schroedinger equation is satisfied by particles, in
order to see how severe the constraints are upon any theory involving non-
linearities.
This general difficulty might be avoided if the new objects are the material
of which particles are composed, in the sense of section IV. It is possible
to construct models in which some continuous function satisfies rather
complicated dynamical equations, while certain functionals evaluated from it
satisfy much simpler equations. By relaxing some of the requirements of
relativistic quantum mechanics for the constituents of the known particles,
whether these constituents be particulate or continuous, it might be possible
to resolve some of the difficulties alluded to earlier in understanding the
particles, without giving up relativistic quantum mechanics for the particles.
Some physicists are actively working along these lines.

VI. CONCLUSION

I have indicated a number of possibilities for physical entities different
from elementary particles and their composites. Some of these were meant
to be separate from particles, others to be the stuff of which particles are
made. It will not have escaped the reader that the methods proposed to
search for these new entities do not differ much from entity to entity, being
in general careful measurements of some of the properties of particles, such
as energy, etc. I do not see how this can be avoided, unless one were to
assume that these new objects play an important role in some macroscopic
system, which then might be used to detect them. I do not think that this is
possible in any phenomena known to us, with the conceivable exception
of mental processes and cosmological processes. Hence, insofar as physics
experiments on earth are concerned, we might as well restrict ourselves to
studying particles.
Because of the limited number of qualities that we can measure, it would
not always be clear how to interpret a specific effect if it were detected. For
example, I have suggested non-integral electric charges as a manifestation
either of charge-bearing Faraday fields, or of a fluid substratum. While it is
possible to propose a more detailed set of experiments to resolve specific
ambiguities of this type, a general answer to such questions would require
something more elaborate than is warranted until some evidence for these
effects is found.
Another comment that might occur to the reader is that I have been
insufficiently imaginative by not considering objects that have completely new
properties, qualitatively different from the ones we know. I would remark
that this is partly a matter of definition. To some extent, the introduction
of a new quality is equivalent to the recognition that two objects previously
thought distinct have some features in common. The attribution of a new
quality, color, to light is the recognition that a red light and a blue light have
many properties in common. Similarly, I would argue that the new
properties associated with particles, such as spin, were introduced after the
recognition that apparently distinct states were related. In the case of hypothetical
new entities, the attribution of new qualities to them should not be the first
step, since to detect them we must know how they influence our instruments,
and for that purpose it is sufficient to consider the individual states of the
new entity before going on to study relations between them.
The objects that I have considered are by no means a complete set of
possibilities. Nor is there any real indication at present that anything other
than particles exists in the world. Nevertheless, I would not be surprised if
a systematic experimental search uncovered some aspects of reality not
previously dreamt of in our physics.

It is a privilege to be able to dedicate these thoughts to Ernest Nagel,
whose unswerving devotion to truth and understanding is a constant
inspiration to those who have known him.

NOTES
1. These superselection rules essentially say that no experiment can distinguish
between a superposition of states with different charge, and an incoherent mixture
of such states. They were first described in a paper by Wick, Wightman and Wigner
in Physical Review 88, 101 (1952).
2. T. D. Lee, Physical Review 137, B1621 (1965).
3. See W. Heisenberg, The Physical Principles of the Quantum Theory, Dover,
1930.

ON CARTESIAN AND DARWINIAN
ASPECTS OF BIOLOGY [1]

Theodosius Dobzhansky

In The Structure of Science Nagel (1961) wrote: “It is a mistake to
suppose that the sole alternative to vitalism is mechanism. There are sectors
of biological inquiry in which physicochemical explanations play little or
no role at present, and a number of biological theories have been
successfully exploited which are not physicochemical in character. . . . Thus there
is a genuine alternative in biology to both vitalism and mechanism—namely,
the development of systems of explanation that employ concepts and assert
relations neither defined in nor derived from the physical sciences.” I
believe that both biology and philosophy would profit if this very lucid
statement were to become more widely known and better understood than it
is in actuality. The present article is really an attempt at exegesis on Nagel’s
statement. I am, however, stating also some views of my own, for which
Professor Nagel is evidently in no way responsible.
The old-fashioned thoroughgoing vitalism of Harvey, Wolff, Bichat,
Driesch, or Bergson has been a dead issue in biology for at least half a
century. Nothing is more indicative of the present situation than that the
few modern adherents of vitalism do not willingly admit being vitalists.
Mechanism has triumphed in biology certainly not because all life processes
have been exhaustively described in physical and chemical terms. No
reasonable mechanist seriously plans to have such a feat accomplished in the near, or
even in the remote, future, although he affirms that this is possible in principle.
Vitalism has been rejected, rightly in my opinion, because it has turned out
to be unnecessary and unprofitable as a guide to discovery. Mechanism, on
the contrary, performs this function extremely well.
The present situation in biology is much more interesting than it was
when the banal and wearisome vitalism-mechanism controversy was
holding the attention of many biologists. There are in biological studies two
contrasting, though complementary rather than alternative, approaches or
methodologies. One is Cartesian or reductionist, the other Darwinian or
compositionist.
Descartes considered living bodies, including human bodies, to be
automata, i.e., machines describable in physical and eventually in mathematical
terms. They are, accordingly, “studiable” by means of the famous Cartesian
method. We must divide them, like any complex phenomena, into simplest
components amenable to study. These components must be described in
clearest and most unambiguous mechanical, physical, and chemical terms.
The hope is that, in the fullness of time, when the chemical study of the
components of the living processes will have advanced far enough, the more
complex biological phenomena will be seen clearly as patterns of simpler
chemical and physicochemical ones. Reduction is “the explanation of a
theory or a set of experimental laws established in one area of inquiry, by a
theory usually though not invariably formulated in some other domain”
(Nagel, 1961). Biological explanations and laws will thus be shown to be
special cases of the more “powerful” chemical explanations.
Reductionism in biology is often hedged with reservations. One of
them alleges that the presently available knowledge of physics and chemistry
may not be quite sufficient for a wholly satisfactory account of the biological
phenomena. However, more complete knowledge and more advanced
methods will eventually be powerful enough for the purpose. Cybernetics
and the rapidly progressing computer technologies are often mentioned as
examples of such powerful methods which have only recently become
available, and which are likely to be greatly improved. It is, of course, reasonable
to expect that there will be progress both in biological and in physical
sciences, but I do not regard this appeal to a conjectural future as changing
the reductionist methodology in any important way. An eloquent statement
of a modern form of the Cartesian creed in biology is given by Asimov
(1960) in the following sentences: “Modern science has all but wiped out
the borderline between life and nonlife. Nowadays the question ‘What is
life?’ is asked by physicists as often as by biologists. In fact, biology and
physics are merged in a new branch of science called biophysics—the study
of the physical forces and phenomena involved in living processes. . . . And
it is to biochemistry (‘life chemistry’) that biologists today are looking for
basic answers to the secrets of reproduction, heredity, evolution, birth,
growth, disease, aging, and death.” And further: “A machine, we have seen,
can calculate, remember, associate, compare, and recognize. Can it also
reason? The answer again is yes.”
Whether reductionist explanations are by themselves sufficient is,
however, questioned by not a few biologists. Warren Weaver (1964) wrote that
“A person usually considers a statement as having been explained if, after
the explanation, he feels intellectually comfortable about it.” The feeling of
comfort is, of course, a matter of individual taste, and on matters of taste
it is difficult to arrive at a unanimous agreement. Many biologists will,
however, agree with the following statement of Dubos (1965): “In the
most common and probably the most important phenomena of life, the
constituent parts are so interdependent that they lose their character, their
meaning, and indeed their very existence, when dissected from the
functioning whole. In order to deal with problems of organized complexity, it is
therefore essential to investigate situations in which several interrelated
systems function in an integrated manner.”
To feel “intellectually comfortable” about our understanding of
biological phenomena, Darwinian, or compositionist, explanations are
required. The matter can best be stated in Simpson’s words (1964): “In
biology then, a second kind of explanation must be added to the first or
reductionist explanation made in terms of physical, chemical, and
mechanical principles. This second form of explanation, which can be called
compositionist in contrast with reductionist, is in terms of adaptive usefulness
of structures and processes to the whole organism and to the species of
which it is a part, and still further, in terms of ecological function in the
communities in which the species occurs.” It should be made perfectly
explicit that a biologist is not forced to choose between reductionist and
compositionist explanations. They are not only compatible, but are equally
necessary, for the good reason that they are complementary.
What is, however, the “adaptive usefulness” in terms of which
compositionist explanations are to be stated? It is a curious fact that many, even
most, basic concepts of biology have proved intractable for exact definition.
Nobody has succeeded in inventing fully satisfactory definitions of what is
life, individual, species, mind, self-awareness. Many have tried but failed
to define adaptation rigorously. It is not my intention to add another
ineffectual attempt. The situation is not as bad, however, as it may seem to a
nonbiologist. It happens only rarely that one is confronted with a situation
in which it is difficult to decide whether something is or is not alive, or
whether one observes a single individual or two or several. Doubts about
individuals or populations belonging to a single species or to different ones
are more frequent; historically, these difficulties proved a blessing in
disguise, since they suggested and eventually proved that species are not
created but evolving entities. What is uncertain indeed is whether animals
other than Homo sapiens have minds and self-awareness; strictly speaking,
each of us is aware only of his own.
When a biologist states that an organism is adapted to certain
environments, he means that this organism can survive and reproduce in these
environments. An adaptive trait is, then, a structural or functional
characteristic, or more generally an aspect of the developmental pattern of the
organism, which enables or enhances the probability of this organism
surviving and reproducing (Dobzhansky, 1956). Adaptedness is a state of being
adapted; adaptation refers to the process of becoming adapted; and
adaptability means that the organism concerned can remain or can become adapted
in a certain range of environments.
When words are borrowed from everyday language to serve as technical
terms, misunderstandings are liable to result. “Adaptation” is plagued with
ambiguity, for it is used also in contexts which are biologically irrelevant.
Pieces of furniture or implements or machines are said to be “adapted” for
certain purposes. Biological adaptation is concerned with survival and/or
reproduction; it is found only in living bodies; a cadaver is no longer
adapted, although certain organisms are adapted to live on cadavers.
Adaptedness first arose with the origin of life, since this life did not become
extinct; the origin of life was, however, not adaptation. Some critics claimed
that adaptation is a tautology, because what lives must be adapted to live.
This is not so. No organism is adapted in the abstract; it can only be adapted
to certain environments. Man is not adapted to feed on pasturage, while
horses and cows are so adapted; palms and bananas do not survive in
Canadian forests, while larches and spruces do; certain microorganisms
grow on laboratory media and others do not.
The real difficulty lies in the fact that adaptedness refers both to survival
and to reproduction. An organism obviously has to survive in order to
reproduce, and it has to reproduce in order to survive in the next generation.
The viability and the reproductive capacity are in general positively
correlated, but not perfectly so. Virgin females of Drosophila flies live longer
on the average than those inseminated and actively laying eggs. Mules
and certain other species hybrids, both in animals and in plants, are
quite viable and vigorous but completely sterile. One is forced to distinguish
adaptedness of individuals and of populations, such as species, and to
recognize that the two may occasionally be at cross-purposes. The honey-bee
workers are richly provided with structural and behavioral features enabling
them to collect food and to build nests; they are, however, underdeveloped
sterile females, and their stingers have reversed barbs which cause a bee
that stings an enemy to commit suicide. This suicide is probably adaptive
as far as the defense of the colony and of the reproductive individuals
(“queens”) is concerned. Neither solitary bees nor most colonial species
have this ambivalent adaptive trait.
The statement that an individual organism is alive does not tell us either
the quality or the degree of the adaptedness. The same is true of the
adaptedness of populations and of species. Substantive research is required to
discover to which environments the organism is adapted, by what means, and
to what extent. Perfect individual adaptedness would enable the individual
to live forever. This is not as fantastic as it may sound. In some species of
trees, one of which is the California redwood, not only may individual trees
stand for centuries, but since they are capable of stump-sprouting, they may
live apparently as long as the external environment remains propitious. This
individual near-immortality does not guarantee perdurability of the species;
in fact, the California redwood is a relict species in danger of extinction.
By contrast, certain insect species in which the individual is very
short-lived seem to be thriving and remarkably adaptable.
The conflict between individual and species adaptedness is, after all,
resolvable. It is resolved by a compromise. Reproduction is a necessary,
though not sufficient, condition for the perpetuation of species; survival of
individuals is a necessary, though again not sufficient, condition for the
reproduction. Individual adaptedness for survival is generally highest before
and during the age of reproduction, and it dwindles toward the close of
the reproductive age and thereafter. Long-lived organisms generally have
extended reproductive periods. Very old redwood trees still bear cones with
viable seed. One may ask why the adaptedness does not lapse immediately
after the last offspring is produced, and why so many common and
ecologically successful organisms are short-lived, instead of living and
reproducing forever.
Except where the postreproductive individuals consume food, or
otherwise deprive the young of the wherewithal for living, the species would
gain no advantage from killing them off quickly. A rapid extermination
would require a very special set of mechanisms that would at the close
of the reproductive period at once destroy the individual adaptedness which
until then was in the interest of the species to maintain. What happens in
reality is that the resistance to diseases and environmental hazards declines
gradually. In senile individuals some organ systems stubbornly continue to
operate normally until the very end. In medical writings the reaction of the
body to disease is sometimes referred to as “adaptation” (see Dubos, 1965).
This apparently contradictory usage may be brought in line with the more
customary one if it is understood to mean that a disease is a manifestation
of an incompletely successful struggle of the organism against
environmental insults and internal disharmonies; in other words, a disease may be
construed as an adaptive reaction of the organism which has at least
temporarily miscarried.
An indefinite prolongation of life and of the reproductive capacity would
lead to a population increase. This would sooner or later be curbed by
Malthusian checks resulting from overpopulation. It would also interfere
with further evolution. After all, evolution often involves replacement of
one form of life by another (anagenesis). These arguments seem cogent
enough on the surface, but they have serious weaknesses. Evolution is
opportunistic, and it rarely can sacrifice an immediate advantage for eventual
gains. Improved adaptedness which results in an increase of the longevity
and a prolongation of the reproductive period will in general be promoted
by natural selection.
Is there an escape from this quandary? Probably it is difficult for
physiological reasons to achieve a body organization that would not wear out with
age. Yet the redwood is an example of an organism which has come close
to this achievement. Furthermore, what may be approaching adaptive
“perfection” in one environment is likely to become imperfect when a new
environment comes on the scene. Temporary adaptive perfection may mean
environmental overspecialization. All these considerations are, however, no
more than educated guesses. If I could write science fiction, I would like to
imagine a world with an absolutely constant environment, populated by
perfectly adapted organisms who have achieved individual immortality. They
would, however, have to dispense with the joys of parenthood.
It has already been pointed out above that the Cartesian and the
Darwinian approaches to biology are not competitory but complementary. This
is sometimes challenged and the question is asked: Are there biological
phenomena inaccessible to Cartesian, reductionist explanations? In this
context, the question reveals a miscomprehension. All biological phenomena
have Cartesian and Darwinian aspects; they can be fully understood only
when both aspects are seen in their interrelations. A biologist is faced with
several hierarchically superimposed levels of integration of structures and
functions. Life presents itself almost always in the form of discrete quanta—
individuals. But unlike the atoms of classical physics, biological individuals
are divisible into numerous, diversified, and highly organized components—
cells. Cells are, in turn, composed of numerous subcellular organelles
(chromosomes, nucleoli, mitochondria, ribosomes, etc.), and the organelles are
built of numerous molecular species, arranged in complex patterns. The
supraindividual forms of integration seem less tangible in spatiotemporal
sense than the infraindividual ones, but they are equally interesting and
significant. Mankind is less clearly perceived than an individual man, but it is
nevertheless a meaningful entity. Sexual reproduction makes individuals
discrete but interdependent components of reproductive communities—
Mendelian populations. Mendelian populations are in turn united by
reproductive bonds into genetically closed systems—biological species. Species
form more or less interdependent ecological communities or ecosystems.
The subdivision of biology into subordinate disciplines is a matter of
convenience. The relation of the historically established and generally
recognized disciplines to the levels of biological integration can be envisaged
as follows:

Integration Level                 Disciplines

Ecosystem, community              Community ecology, zoo- and plant geography
Population                        Systematics, population ecology, population genetics
Individual                        Morphology, physiology, behavior study, embryology,
                                  developmental genetics, physiological ecology
Cellular and cellular organelle   Histology, cytology, cytogenetics, cell physiology
Molecular                         Chemical physiology, biochemistry, biophysics

Several biological disciplines are concerned with phenomena of more
than a single integration level—ecology, genetics, and physiology with three
each. This is a defect neither of the proposed classification of the levels nor
of the delimitation of the disciplines in question. Bridging the gaps between
the integration levels is one of the functions of scientific study. The
molecular level is nowadays so often contrasted with those above it that it has
seemed desirable to suggest a collective name for these latter—the
organismic level (Dobzhansky, 1964). It is convenient to speak of organismic
biology and molecular biology. Organismic biology is concerned with
studies on the organismic level, which include as sublevels everything above
the molecular level.
It is often alleged that the molecular level of biological phenomena is the
one in terms of which all other levels must be understood in “new” biology.
This is why molecular biology is boastfully styled “fundamental.” Indeed,
molecular biology has in our time achieved successes which are likely to be
regarded in the history of science among the most glorious achievements of
the whole scientific enterprise. Further advances may confidently be
expected, provided that our civilization does not commit suicide through a
nuclear war or a runaway population explosion. This raises anew serious
questions of scientific strategy. Is the Cartesian dream about to become
reality, and the molecular level research about to make the phenomena of the
organismic levels explicable without studies directed specifically at these
levels? In other words, does Cartesian biology make Darwinian biology
superfluous?
Each biological integration level is a realm of phenomena and regularities
characteristic of that level. This does not mean that some novel elemental
forces are superadded at each step ascending from a lower level to the one
above it. To distinguish the integration levels we need not assume some sort
of multiple vitalism. The phenomena and regularities observed on each level
are patterns of components belonging to the underlying levels—a population
is a pattern of individuals, individuals of cells, cells of molecules.
Reduction of the phenomena of the organismic levels to those of the
molecular level, and further to chemistry and to physics, is, in principle,
possible. What is the meaning of this statement? Organic phenomena are,
indeed, patterns of chemical and physical components. There is no vital
force, no entelechy, no psyche. The point is, however, that understanding the
patterns is just as essential and exciting as understanding the components.
A mosaic picture consists of stones of various colors, but it is a pattern, not
a pile of stones. A living body is not a mixture of chemicals stirred together;
it is an integrated system which arose gradually during the two billion years
of organic evolution.
An article of faith current in some circles is that a complete knowledge of
the components will automatically reveal also the patterns which they
compose. The validity of this faith hinges on the meaning of “complete
knowledge.” Laplace thought that if one knew “all the forces acting in nature at
a given instant, as well as the momentary positions of all things in the
universe,” then an intellect “sufficiently powerful to subject all data to analysis”
could with perfect certainty predict all the future and “retrodict” all the past
events. But, as Nagel (1961) said, Laplace was “guilty of a serious non
sequitur,” because “such a claim would be warranted only if, in addition to
knowing these things, Laplace’s divine intelligence would be able to analyze
all traits of physical objects whatsoever ... as definable in terms of the
variables that constitute the mechanical state of a system. However,
mechanics does not rest on the assumption that such an analysis is in fact
possible.”
In biology, it is only rarely practicable to deduce or to predict the
patterns from a description of the components. In point of fact, there is little
to be gained from such predictions. The reason is very simple: it is the
extreme complexity of organismic patterns. The strategy of biological
research is to discover the patterns first, then the components down to the
molecular ones, and finally the functional and adaptive meaning, as well as
the evolutionary origins of the particular ways the components are
combined in the patterns. Nor is this strategy peculiar to biology alone (except
that the question of adaptedness is not relevant in inorganic nature). Water
is a compound of hydrogen and oxygen, and alcohol a compound of
hydrogen, oxygen, and carbon. Hydrogen and oxygen are gases, and carbon is a
solid which is black in charcoal and light in diamond. Water and alcohol
are colorless liquids with remarkably different properties, especially when
imbibed by man. These properties are sometimes described as “emergents,”
but this word has been abused so badly that it is no longer serviceable. The
point is simply that the properties of water and of alcohol are studied as
such, rather than deduced from the properties of the elements that compose
them.
An example may be helpful at this point to illustrate the application to
biology of the foregoing, perhaps overly abstract, discussion. Natural
selection is a population level phenomenon or pattern; what are the components
of this pattern? The essence of natural selection is, in the modern view,
differential reproduction of the carriers of different genotypes in a given
environment or a sequence of environments. The interaction of the
corporate genotype (gene pool) of a population with the environments in
which the population lives may change the composition of the gene pool
(directional selection, or balancing and normalizing selection leading
toward an equilibrium), or may conserve a given state of the gene pool
(balancing or normalizing selection at equilibrium). In any case, for
natural selection to occur there must exist two or more genotypes with
different Darwinian fitness in a given environment.
Darwinian fitness is sometimes referred to also as selective value and as
adaptive value. There is evidently an affinity between the Darwinian fitness
or adaptive value on the one hand and the adaptedness on the other. The
two concepts are, however, not identical. Since both are among the key
concepts in organismic biology, it is advisable to have their relationships
clarified, the more so because they are not infrequently confounded both in
philosophical and in biological literature (Beckner, 1959; Scriven, 1959;
Grene, 1958, 1961; Goudge, 1961; Smart, 1963; Manier, 1965). Darwinian
fitness is concerned with the rate of transmission of genes from generation
to generation. It is a measure of the reproductive effectiveness of a
genotype. It can be defined operationally as the average contribution which the
carriers of a genotype, or of a class of genotypes, make to the gene pool
of the following generation relative to the contributions of other genotypes
(Dobzhansky, 1962, 1964).
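The operational definition just given can be illustrated numerically. What follows is a minimal sketch in Python, not a procedure taken from the text: the genotype labels and the contribution figures are hypothetical, and scaling by the largest contribution is only one common convention for expressing fitness as a relative measure.

```python
def relative_fitness(contributions):
    """Express each genotype's average contribution to the next
    generation's gene pool relative to the largest contribution.

    `contributions` maps a genotype label to the (hypothetical) average
    contribution made by carriers of that genotype.
    """
    best = max(contributions.values())
    if best <= 0:
        raise ValueError("at least one genotype must contribute")
    return {genotype: value / best for genotype, value in contributions.items()}


# Illustrative numbers only; they are not measurements from any study.
example = {"AA": 1.8, "Aa": 2.0, "aa": 1.0}
fitness = relative_fitness(example)
```

On this convention the best-reproducing genotype has fitness 1 and every other genotype a fraction of it, which makes plain that the measure is meaningful only when at least two genotypes are compared.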
Like adaptedness, the Darwinian fitness is a function both of the
genotype and of the environment. A high adaptedness, as well as high
Darwinian fitness in a certain locality or a geographic region at the present
time, may become lower or higher in the future or in a different locality.
Health, hardiness, and vigor enhance the individual adaptedness (or
“fitness” in the vernacular; here is a danger of another semantic muddle).
Yet these qualities will be reflected in a high Darwinian fitness only if they
lead to a high rate of transmission of the genes to the following generations.
The point is that while the adaptedness may be treated as an absolute
measure, Darwinian fitness is a relative one. Consider a genetically
uniform array of individuals, such as a bacterial clone. It may be well or
poorly adapted to the environment of a laboratory culture, as measured by
its rate of growth in the log phase or by its ability to survive in the
stationary phase. Its Darwinian fitness will become manifest, however, only
when another genotype is introduced into the culture or arises by mutation,
or when the growth of one culture somehow influences the conditions of
other cultures. A high adaptedness of one genotype may go together with
a relatively low Darwinian fitness, if this genotype is exposed to
competition with a still better adapted genotype.
Interesting differences between adaptedness and Darwinian fitness arise
from certain genetic conditions, such as chromosomal polymorphisms
maintained by heterotic balancing selection. In a population of Drosophila
polymorphic for certain chromosomal inversions, the Darwinian fitnesses
of the homokaryotypes are lower than that of the heterokaryotype.
However, experimental chromosomally monomorphic as well as polymorphic
populations can be maintained indefinitely in laboratory population cages.
The monomorphic ones build numerically smaller populations than the
polymorphic ones, given the same amount of food. In nature, polymorphic
colonies would outbreed the monomorphic ones, and eventually replace
them. This can happen, however, only if the monomorphic and polymorphic
colonies exchange migrants; if they are completely isolated, for example
on oceanic islands, the replacement need not occur.
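How heterotic balancing selection can maintain a polymorphism may be sketched with the standard one-locus, two-variant model of population genetics. This is an illustration under stated assumptions rather than a description of the Drosophila experiments: the selection coefficients s and t against the two homokaryotypes, the function names, and the starting frequencies are all hypothetical, and random mating with non-overlapping generations is assumed.

```python
def next_frequency(p, s, t):
    """One generation of selection on the frequency p of arrangement A.

    Assumed relative fitnesses (hypothetical):
      A/A homokaryotype: 1 - s
      A/B heterokaryotype: 1
      B/B homokaryotype: 1 - t
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * p * (1 - s) + p * q) / mean_fitness


def iterate(p, s, t, generations):
    """Apply next_frequency repeatedly and return the final frequency."""
    for _ in range(generations):
        p = next_frequency(p, s, t)
    return p


# With s = 0.3 and t = 0.7 the model's equilibrium is t / (s + t) = 0.7.
from_low = iterate(0.10, 0.3, 0.7, 500)
from_high = iterate(0.95, 0.3, 0.7, 500)
```

Under these assumptions the frequency approaches t/(s + t) from either side, whereas a population begun at frequency 0 or 1 stays monomorphic, which agrees with the remark that isolated monomorphic colonies need not be replaced.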
Another interesting situation is the chromosomal “sex-ratio” variant
in Drosophila. A male carrying this genetic variant in its X-chromosome
produces daughters and few or no sons when crossed to any female. Since
daughters do and sons do not inherit X-chromosomes from their fathers, a
“sex-ratio” male transmits his X-chromosome to his entire progeny, while
a normal male transmits his X to only half of the progeny. Other things
being equal, the “sex-ratio” genotype would have a higher Darwinian
fitness in a population than the normal one. (In reality this is not the case,
since females homozygous for the “sex-ratio” have a low fitness (Wallace,
1948).) Uncontrolled spread of the “sex-ratio” could result in a disaster
and extinction of the population; the population might come to consist of
females and no males, which in an organism incapable of parthenogenesis
would spell extinction. However, as long as the shortage of the males is
not acute enough to interfere with the reproduction, a population
containing “sex-ratio” in high frequency might have a high Darwinian fitness. A
collapse might occur rather suddenly.
The above examples should make it clear that both adaptedness and
Darwinian fitness describe the properties of individuals, genotypes, and
populations not only in a given environment but also on a given time level.
It has been pointed out above that adaptedness must be distinguished
from adaptability. Analogously, evolutionary changes leading to a high
Darwinian fitness may mean a narrow specialization to a given environment,
or to a limited range of environments, and a loss of the evolutionary
plasticity. This point has been lucidly analyzed by several authors, especially
by Thoday (1953, 1958), and recently by Manier (1966). Thoday’s
terminology is, however, different from that employed in the present article.
He defines the fitness of a “unit of evolution” (Mendelian population or
species) as “its probability of leaving descendants after a given long period
of time. Biological progress is increase in such fitness.” Therefore, “the
fit are those who fit their existing environments and whose descendants will
fit future environments.” Thoday’s “fitness” has several components, which
he designates as adaptation, genetic variability, genetic flexibility,
phenotypic flexibility, and stability of the environment. For a discussion of these
components, the reader is referred to Thoday’s papers.
Darwinian fitness, as this term is used in population genetics, is a
measure of the relative efficiency of the transmission of genetic variants
from generation to generation. Since generations follow each other in time,
the time dimension is inevitably introduced. In his discussion of “fitness”
Thoday stresses, however, the “long period,” such as 10^8 years. This limits
the operational usefulness but not necessarily the conceptual value of the
term. Darwinian fitness has been measured experimentally in favorable
materials, such as chromosomal polymorphisms in Drosophila, and the
number of such measurements is growing rapidly. Thoday’s fitness refers
rather to the evolutionary perspectives of a species, its permanence,
security, and progress. Evolutionary progress has proved to be an
extraordinarily elusive concept, although biologists have for a long time felt that
some evolutionary changes deserve being called progressive, others
regressive, and still others neutral in this respect. Attempts to formulate explicit
criteria of progressiveness have often been made but rarely succeeded;
Thoday’s is perhaps the most nearly successful one, and yet it too stands in
need of improvement.
It has been rightly pointed out, both by philosophers and by biologists,
that the theory of evolution has little predictive power. Indeed, at the
present level of our knowledge, long-term predictions of evolutionary
events are extremely hazardous. The following may, however, serve as
examples. Consider these three zoological species: the grizzly bear (Ursus
horribilis), the Norway rat (Rattus norvegicus), and man (Homo sapiens).
The adaptedness of the bear is restricted to a rather narrow range of the
existing environments, that of a rat to a wider one, and that of man to the
widest, because he is able to tailor environments according to his needs.
There is not much point in trying to compare the Darwinian fitnesses of
the three species, since they rarely compete for the same ecological niches.
It is, however, a not implausible conjecture that the grizzly has the
greatest and man the smallest probability of becoming extinct in, say, one
thousand years from now (a much smaller number than that mentioned
by Thoday!). This conjecture is based evidently on an extrapolation to the
future of the environmental changes which have been taking place on earth
over the last several centuries.
The grizzlies will either be exterminated or will continue in small
numbers in nature preserves. The perspectives of the rat are brighter, since at
least until now this species has shown both a remarkable adaptedness and
an adaptability. There is, however, a possibility that man may discover a
way so to modify the rat’s habitats that the toleration limits of the rat
species will be exceeded. In fact, a drug denoted McN-1025 is highly toxic
to rats but not to other mammals (Roszkowski, Poos, and Mohrbacher,
1964). Man as a species is most likely to endure, and is in this sense the
“fittest.” This prediction has, however, an unknown margin of uncertainty;
it is not quite inconceivable that some virus disease may not be controllable
and may destroy the species, or that our species will commit suicide by
means of an atomic war, population explosion, or like madness.
What, then, is the usefulness of the Darwinian, compositionist, forms of
explanation in biology? They do not compete with or replace the Cartesian,
reductionist, explanations; do they add something important? They do,
because the two types of explanation deal with two aspects of biological
phenomena. Either one by itself may or may not be valid, but it will
certainly be incomplete. We observe certain spatiotemporal systems which
we call living organisms, and for the purposes of the present discussion we
shall assume that we know how to distinguish the living from the nonliving.
Now, the investigation of the living and the nonliving systems proceeds, to
begin with, the same way. One examines their components, the movements,
processes, and changes which are taking place in them. The description
so obtained may be complete enough and satisfactory for the nonliving
systems, but for the living ones this is only a part of the story.
It has been pointed out above that an organism which is alive in a given
environment is therefore adapted to that environment. There exist, however,
different kinds, degrees, and manners of adaptedness. How long does the
organism remain alive in this environment? Is it able to grow, develop,
mature, and reproduce in the environment in question? In how wide a
range of other environments can it live, develop, and reproduce? What
particular traits, structures, functions, and physiological processes make
or could make it adapted to survive longer, or make its reproduction more
or less successful in a particular environment or a succession of
environments? What are the characteristics of a given species which qualify it for
membership in a particular ecosystem?
Different organisms, variants of a single species or different species, are
often inhabitants of the same or similar environments in the same territory.
This raises a host of problems concerning not only their adaptedness but
also their Darwinian fitness. What form of natural selection permits these
variants to coexist in the same population in the same territory? Are they
likely to coexist indefinitely, or are some of them on the way to
establishment and others to elimination? What genetic or what environmental
changes can alter the balance of the selective forces? Bacterial spores survive
at temperatures too high and too low for growth and reproduction. We can
measure how long they remain alive at a given temperature, and may
investigate the characteristics which make one variant more resistant than
others. Another inquiry may determine the “innate capacity for increase”
of these variants (Andrewartha and Birch, 1954). This measures how
rapidly one genotype or population could increase in numbers under
conditions where food, places to live in, and other necessities for life are not
limiting. Still another study will disclose the relative positions of the
variants under conditions that are limiting or over a range of the
environments to which these variants are exposed in the course of, for example,
the annual succession of seasons.
The origins and histories of the living systems which we observe on
our time level must also be investigated. Historical problems are certainly
not restricted to biology. Nonliving systems, such as suns, planets, and
mountain ranges, also have had their origins and histories. Modern science
is time conscious (Toulmin and Goodfield, 1965). The biologist’s bias is,
however, that life is more interesting than nonlife. This leads him to inquire
how the organisms were able to withstand the insults of their environments,
how they have developed their present adaptedness, or, on the contrary, how
and why they became extinct without issue, as many of the organisms of
the past did.
The array of factors which contribute to the operation of natural
selection, or which bear witness to its efficacy, is a tremendous one. Here we
are concerned with phenomena of organismic biology, and chiefly of the
individual integration level. The chances of an organism to survive, and to
leave a progeny which reaches the reproductive stage of the life cycle,
depend on a multitude of morphological, physiological, and behavioral features
of the carriers of a given genotype, as well as on the environments that
the organism encounters. Which traits are advantageous or disadvantageous,
and in what environments, is evidently different in different organisms. A
great wealth of relevant data has been accumulated in descriptive and
experimental zoology and botany; these data are most interesting and
meaningful when viewed from the standpoint of what makes the organisms
concerned fit or unfit to survive and to reproduce in their environments.
In the light of natural selection many otherwise paradoxical facts
become intelligible. Together with remarkable adaptations organisms often
show astonishing imperfections. Can anything be adaptively more
preposterous than the painfulness and hazard of childbirth in man? The puzzle
becomes less puzzling if it is realized that natural selection operates not
with fit and unfit traits in isolation, but with phenotypes as wholes and with
genotypes that produce the phenotypes. Difficult childbirth may have arisen
as a part of the constellation of changes which gave man his erect posture
and his hands able to make and use tools. The constellation as a whole is
certainly adaptive, and its Darwinian fitness is high because its defects are
presumably compensated for by its advantages.
Reproductive capacity is an important component of the Darwinian
fitness; does it follow that natural selection must work to increase
indefinitely the number of the progeny produced? Lack (1954) has shown
that in at least some species of birds the mean number of surviving young
per nest is greatest in nests with a certain modal number of eggs. With
too many eggs the parents are unable to take care of their progeny, and
the net progeny size diminishes instead of increasing. Increases of
fecundity would, then, be discriminated against by natural selection.
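The reasoning about clutch size can be made concrete with a toy calculation. The sketch below is purely illustrative and is not drawn from Lack's data: it assumes, hypothetically, that the survival chance of each egg falls linearly as the clutch grows, and the parameter max_brood is an invented ceiling at which no young survive.

```python
def surviving_young(eggs, max_brood=12):
    """Expected number of surviving young for a clutch of `eggs`.

    Hypothetical assumption: per-egg survival declines linearly from 1
    (tiny clutch) to 0 (clutch of `max_brood`), because parental care
    is spread more thinly over a larger brood.
    """
    survival = max(0.0, 1.0 - eggs / max_brood)
    return eggs * survival


# The clutch size that maximizes surviving young, not eggs laid.
best_clutch = max(range(1, 13), key=surviving_young)
```

With these invented numbers the most productive clutch is an intermediate one, and any further increase in eggs laid lowers the number of surviving young, which is the situation in which selection would act against higher fecundity.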
Natural selection is predicated on the availability of a store of genetic
variants. The origin of the components of this store, of the genetic raw
materials with which the selection works, is an important problem. The
present evidence shows that the genetic raw materials arise through
mutation and gene recombination. These are phenomena of the cellular and
molecular levels. Mutation is known to be a complex of several rather
disparate processes. Some mutations result from changes in the internal
structure of the affected genes, others from relatively cruder mechanical
alterations in the architecture of chromosomes or chromosome parts.
Changes in the self-reproducing components of the cytoplasm are
relatively little known. Gene mutations, or at least some of them, represent
substitutions, or additions or subtractions, of single nucleotides in the
DNA chains. Such mutations result in substitutions of usually single amino
acids in the proteins specified by the genes concerned. The chromosomal
mutations are duplications, deletions, translocations, or inversions of blocks of
genes, or reduplications or losses of whole chromosomes or chromosome sets.
Gene recombination is a corollary of the process of meiosis, and meiosis
is a corollary of sexual reproduction. In many lower organisms meiosis and
sex have not developed or have been lost secondarily; in some of these
organisms gene recombination can nevertheless take place, through
parasexuality, transduction, transformation, and perhaps other yet to be
discovered mechanisms. That genes recombine is important because the effects
of genes interact in the development. The notion which still lingers among
some biologists, that each gene produces a portion of the phenotype
independently of other genes, is erroneous. Perhaps closer to reality is to describe
the phenotype as a gestalt resulting from a concerted operation of the whole
genotype. Recombination creates novelty.
Naturalists and organismic biologists are old-fashioned, in the sense
that they are direct inheritors of an old and venerable intellectual tradition
—the Darwinian tradition. So are molecular biologists, inheritors of an
even older and no less venerable Cartesian tradition. Molecular biology
has in our day achieved successes which the advances of organismic
biology during the same period do not equal. What about the future?
Forecasts are always fallible in such matters, but one has to do one’s best
to steer the research activity in what seem to be profitable directions. At
present both molecular and organismic biology seem ready for major
new advances. The prime consideration here should perhaps be that they
should advance not separately, not in isolation, but together, in cooperation.
A British politician opined that in politics scientists “should be on tap,
not on top”; in biology, molecular and organismic biologists should be on
tap for each other, and neither should be on top of the other.
One of the most encouraging characteristics of biology today is its
increasing unification and integration. The unification proceeds in the face
of the ever accelerating growth and the consequent need of specialization.
How is specialization compatible with unification? The answer lies, I think,
in the great unifying principles, the Cartesian reductionism and the
Darwinian compositionism. Certainly neither of them is new, but it is only
now that they are beginning to be generally understood and accepted by
most biologists. Not all biologists even at present accept their full
implications. There is some “academic lag” still to be overcome!

NOTE
1. The present article is an expanded and modified version of the one entitled
“Are naturalists old-fashioned?” published in the Centennial Volume of The
American Naturalist in 1966. A still different version is published in Volume 2 of
Evolutionary Biology (T. Dobzhansky, M. K. Hecht, and W. C. Steere, eds.),
Appleton-Century-Crofts, New York, 1968. I have profited greatly from discussions with my
colleagues, Drs. R. J. Dubos, H. G. Frankfurt, and Z. Kochanski, and with other
members of the Philosophy of Biology Study Group, organized by Dr. Frankfurt at
the Rockefeller University.
REDUCTION: Ontological and Linguistic Facets1
Carl G. Hempel

1. INTRODUCTION

Among the many issues in the logic and methodology of science to whose
clarification Ernest Nagel has contributed by his thorough and illuminating
analyses, the subject of theoretical reduction holds a place of special
philosophical interest. Reduction affords an important mode of scientific
explanation; it raises intriguing questions concerning the relation between
the concepts of the reducing and of the reduced theory; it is a procedure that
reaches across the boundary lines of scientific disciplines; and its potential
scope encompasses all the major branches of empirical science, since
problems of reducibility arise in fields as diverse as physics and chemistry,
biology, psychology, and the social and historical disciplines. All these
aspects lend philosophical interest and importance to the topic of
reduction. But one further reason for the fascination the subject has held for
philosophers lies, I think, in the ontological roots of many questions
concerning reduction, questions such as these: Are mental states nothing
else than brain states? Are social phenomena simply compounds of
individual modes of behavior? Are living organisms no more than complex
physicochemical systems? Are the objects of our everyday experience
nothing else than swarms of electrons and other subatomic particles? Or is
it the case, as the doctrine of emergence would have it, that as we move
from subatomic particles to atoms and molecules, to macroscopic objects,
to living organisms, to individual human minds, and to social and cultural
phenomena, we encounter at each stage various novel phenomena which
are irreducible, which cannot be accounted for in terms of anything that
is to be found on the preceding levels?
These questions appear to concern ontological issues, namely, the basic
identity or difference of various kinds of empirical states and processes
which, prima facie, exhibit striking differences. Yet in philosophical studies
in the past several decades, the problems of reduction have generally been
given a decidedly linguistic turn. Thus, that chapter in Ernest Nagel’s book
The Structure of Science which deals with the concept of reduction
characteristically bears the title “The Reduction of Theories” and examines the
relations that obtain between the terms and the laws of two theories when
one of them is reduced to the other. The linguistic construal of reduction
goes back at least to early logical empiricist studies, especially to the work
of Carnap and Neurath on unitary languages for empirical science and to
Carnap’s subsequent emphasis on the possibility and importance of stating
all problems of the philosophy of science in the “formal mode” rather than
in the “material mode of speech” or in “pseudo-object-sentences.”2 The
view then held by Carnap, that all significant issues in the philosophy of
science can be restated so as to concern exclusively the syntax of the
language of science, is much too restrictive, as Carnap himself has long
since noted;3 but a construal that is linguistic in a wider sense has proved
remarkably fruitful—so much so, in fact, that many recent and current
studies of reduction characterize the subject from the very beginning as one
concerning the language of science rather than the “nature” and the
interrelationships of various kinds of entities, states, and processes. This
linguistic turn is indicative of philosophical misgivings about the ontological
construal of the issues in question, and it is an attempt to explicate the latter
by restating them in a clearer and philosophically more satisfactory fashion.
But the reasons for rejecting the ontological versions and for construing
problems of reduction as concerning the language of science are not, as a
rule, made very explicit. It may be of some interest, therefore, to reflect
upon the rationale of this linguistic turn and to explore some of its
implications for current problems in the theory of reduction. This I will attempt
to do in the present essay.

2. ON MECHANISM IN BIOLOGY

As an example of an originally ontological view that can be construed
as a reductionist thesis, consider the mechanistic conception of biological
phenomena, according to which all biological systems, states, and processes
are basically nothing else than complex physicochemical systems, states, and
processes, and thus are governed by purely physicochemical laws and
explainable in terms of these.
Much like the statement that the morning star is the same thing as the
evening star, this thesis presupposes the possibility of making at least a
conceptual distinction between the two classes of phenomena that it declares
to be basically of the same kind. How, then, are biological systems, states,
and processes to be conceptually distinguished from physical, chemical,
psychological, or sociological systems and occurrences? Let us note first
that objects, states, and events cannot be unambiguously divided into
mutually exclusive classes of “physical entities,” “chemical entities,” “biological
entities,” and so forth; for any individual object or event (the sun, the
Atlantic Ocean, the periods of the Black Death in London, the assassination of
President Kennedy) can become a subject of investigation for many
different disciplines of empirical science. These disciplines, however, are
concerned with different aspects of the phenomena in question; and the aspects
of concern to a particular discipline will be characterized by means of a
conceptual apparatus, and a corresponding vocabulary, that is distinctive
of the discipline. Thus, an object or event may be qualified as physical,
chemical, biological, etc., only relative to a particular characterization, i.e.,
according to the vocabulary in terms of which it is described. More
precisely: the labels “physical,” “chemical,” “biological,” “psychological,”
and so forth apply properly not to particular objects or events but to ways
of characterizing them. The same remark applies also to kinds or classes of
things or events. A class of things that can be described in biological
terms may also admit of a description in physicochemical terms; for
example, epinephrine, biologically characterized as a hormone secreted by
the medulla of the adrenal gland, has been chemically identified as amino-
hydroxyphenylpropionic acid, a substance characterized by a structural
formula for its molecules. Thus, the distinction of biological from physical,
chemical, and other kinds of items applies to individual things and events
and to kinds or classes of things or events only “under a specific
description,” i.e., only in so far as they have been characterized by means of a
terminological apparatus distinctive of a certain scientific discipline. Rather
than of physical, chemical, biological, etc., phenomena, we should speak of
physical, chemical, biological characterizations of phenomena. A biological
characterization would be one that contains essential occurrences of
biological terms (i.e., is not logically equivalent to one that contains no
biological terms). But it may contain physical and chemical terms as well:
a statement describing certain thermodynamic or chemical aspects of
bacterial metabolism, for example, would surely count as characterizing
biological phenomena. Terms from the specific vocabularies of psychology or the
social sciences, on the other hand, may be thought of as excluded from the
language of biology; expressions containing such terms may plausibly be
said to characterize psychological or sociological phenomena. (This
construal accords with the idea that the main disciplines of empirical science
can be put into an order in which each “presupposes” the conceptual
apparatus of its predecessors, but not of its successors; the order would be
this: physics, chemistry, biology, psychology, social sciences. This idea
raises various questions of detail, which are not, however, relevant to the
issues here under discussion.)
The considerations just outlined argue for a linguistic construal of the
mechanistic conception of biological phenomena. For the thesis that all
biological systems and processes are basically physicochemical, the
following tentative reformulation suggests itself: Every aspect of a thing or event
that is characterizable by means of biological (and possibly physical and
chemical) concepts can be characterized in terms of physical and chemical
concepts alone. This, in turn, might be construed more specifically as
asserting one of the following theses:

(M1a) Every biological term can be defined by means of physical and chemical
terms;
(M1b) For every biological term, there is a coextensive expression containing
only physical and chemical terms.

Similarly, the mechanistic claim that all biological phenomena can be
explained by physicochemical principles might tentatively be construed as
asserting:

(M2) Every biological law can be derived from purely physicochemical laws.

These explications, however, are inadequate in several respects. First,
mechanism surely does not mean to make the claim M1 for every biological
term ever used; in particular, it does not cover terms characteristic of
theories that have been discarded as empirically false or as untenable on
other grounds. Mechanism cannot be saddled with the view that expressions
such as “vital force” or “animal magnetism” are definable in physicochemi¬
cal terms. Its claim refers only to “actual” aspects of biological phenomena;
and if this idea is given a linguistic construal, it asserts that any term oc¬
curring in a true biological theory is definable or coextensively characteriz¬
able by means of terms occurring in true physical or chemical theories.
Similarly, the biological laws referred to in M2 must be understood as
“actual” biological laws, i.e., as true lawlike sentences containing essen¬
tial occurrences of biological (but not of psychological or sociological)
terms. This suggests the following revised explication of mechanism:

(M') Every true biological theory B is reducible to some true physicochemical
theory P in the sense that
    (M'1) every term of B is definable or coextensively characterizable
    by means of some terms of P; and
    (M'2) every sentence of B is derivable from some sentences of P.

This formulation comes closer to the intuitive intent of the mechanistic
view, but at the price of introducing serious obscurities. For the phrase
“every true biological theory” must be understood to refer not only to theo¬
ries actually formulated at one time or another in the past, present, or future,
but simply to any conceivable true biological theory, whether or not it is
actually ever thought of and formulated; and this evidently is an extremely
elusive idea. Our reformulation is unsatisfactory also because it assumes
the possibility of dividing all conceivable true theories into physicochemical,
biological, and other theories; and this assumption is no less questionable
than that of a general distinction between physicochemical, biological, and
other kinds of systems, as presupposed by the ontological version of mech¬
anism. Thus, the linguistic turn effected in M' does not advance philo¬
sophical clarification.
Thus, we face a dilemma. If, for the sake of conceptual clarity, we give
the thesis of mechanism a linguistic turn, we fail to express its philosophical
intent; and if we try to formulate that intent in ontological terms, or in the
“material mode,” the resulting statement proves to be seriously obscure and
elusive.
The same difficulty besets all reductionist theses that are conceived as
ontological claims. I will try to illustrate this now by another example,
which has recently received a great deal of philosophical attention.

3. ONTOLOGICAL AND LINGUISTIC CONSTRUALS OF
PSYCHOPHYSICAL ASSOCIATION

The reductionist thesis I wish to consider has been said to express the
empirical component of the psychophysical identity theory. As Shaffer
puts it:

Identity Theory rests on an empirical hypothesis. It hypothesizes that for each
particular mental event there is some particular physical event which always
occurs and is such that whenever that physical event occurs then the mental
event occurs. (The theory proposes to explain this by the further hypothesis
that the physical event and the mental event turn out to be one and the same
event.)4

I shall refer to the “empirical hypothesis” noted in this passage as the
hypothesis of universal psychophysical association, or briefly, as the
hypothesis U. The mention made in U of “particular events” evidently is not
meant to refer to individual physical and mental events, for these have
definite temporal locations and therefore cannot recur in the specified
manner; the hypothesis must rather be taken to refer to kinds or types of
mental and physical occurrences. Moreover, the physical events are under¬
stood to occur in the body of the individual who experiences the mental
events. The hypothesis may therefore be restated as follows:

(U) For every kind M of mental state or event there exists a kind B of
bodily state or event such that a person experiences a state or event
of kind M if and only if a state or event of kind B occurs in his body.

But it seems that this statement can be shown to be true without benefit
of empirical evidence. For let us adopt the following definition:

(D) The body of a person will be said to be in state M' if and only if the
person is in mental state M. (Similarly for events; I will henceforth
refrain from always spelling this out.)

If bodily states of this kind are taken into account, hypothesis U becomes
trivially true. It may seem natural to say that states defined by D
do not properly qualify as bodily states; but if the notion of mental state
is taken to be intelligible, as U plainly does, then definition D must surely
be said to specify various states that can intelligibly be attributed to a
human body.
It will be necessary, therefore, to supplement U by specifying the kinds
of bodily (or of physical) states and events to which the hypothesis is
meant to refer. And just as in the case of the mechanistic thesis, this will
have to be done by indicating the conceptual apparatus, and the associated
vocabulary, in terms of which bodily states and events are to be character¬
ized. In the context of the mind-body problem, the conceptual framework
that serves to specify bodily states or brain states will normally be meant
to comprise not only physics but chemistry and biology as well.
The reference made in U to “every kind of mental state” requires anal¬
ogous qualifications. This need is made clear by the arguments already
presented; but it might be further illustrated by this example: Suppose
that a physico-chemical-biological theory has been specified in terms of
whose concepts the bodily states mentioned in U are to be characterized.
And suppose further that P is a certain physicochemical feature (such as,
perhaps, the occurrence of rapid small oscillations of the average brain
temperature) which is describable in the specified theory and which, as a
matter of empirical fact, satisfies the following condition: A person’s brain
has the feature P if and only if the person “is in a pain-state,” i.e., feels
some pain or other. In this case, it would be correct to say that every mental
state which is a pain-state has a corresponding kind of brain-state associated
with it in the manner envisaged by the hypothesis U. However, the associ¬
ated brain-state would be of the same kind, namely, the kind characterized
by the presence of feature P, for all pain-states, irrespective of the qualities
and intensities of the pain; whereas the hypothesis of universal psycho¬
physical association is surely meant to imply that mental states of different
kinds are associated with bodily states of different kinds. And indeed, for
an opponent of a reductive identification of the mental with the physical, any
difference between mental states that was not matched by a difference be¬
tween the associated brain states would constitute evidence that mental
phenomena could not be completely reduced to brain states. But when
do two mental states count as different in a sense relevant to the hypothesis
U? Do my toothache in the morning and my headache in the evening
count as mental states of the same kind, since both are pain-states, or do
they count as different kinds because of the phenomenally different locations
of the pain? Or take my toothache in the morning and my toothache in the
evening: suppose that they are indistinguishable for me in terms of location,
quality, and intensity, yet the later one is accompanied by a feeling of
fatigue—does this qualify them as mental states of different kinds in the
context of the claim made by U? What kinds of occurrences count as mental
states or events, and what features count as establishing differences among
them, evidently depends on our changing conceptions of mental phenomena.
Being possessed by a demon would not count today as a kind of mental
state, whereas being under hypnosis might. And if extrasensory perception
came to be well established, then certain kinds of, and differences between,
mental states might be theoretically countenanced which currently are not,
such, perhaps, as types and degrees of extrasensory receptivity.
If U is to be given a clear meaning, therefore, the scope of the notions
of mental event and mental state must be indicated. Ideally, this would have
to be done by specifying a psychological theory whose conceptual frame¬
work would then settle these matters: different mental states would be
those distinguishable in terms of the concepts of the theory.
As a result, the distinction between bodily and mental states which under¬
lies the hypothesis U and related views comes to be construed as the dis¬
tinction between states characterized in terms of the concepts of physico¬
chemical-biological theories, and states characterized with essential use of
psychological terms. This, I think, provides strong grounds for giving the
psychophysical hypothesis in question a linguistic turn.
For those who hold the identity theory, there is a further reason to
espouse such a construal. For if every mental state is identical with a
physical state, then there can be no empirical characteristics that differ¬
entiate mental states from physical states: the distinction will rather pertain
to alternative ways of characterizing “the same” state—in terms of phys¬
ical, chemical, and biological concepts, or in terms of psychological
(specifically, mentalistic) ones; thus, the distinction will concern states-
under-a-theoretical-characterization.
On this understanding, U would have to be construed as asserting,
roughly, that for every characterization of a state or event that makes
essential use of psychological terms, there is a coextensive one expressed
in physical, chemical, and biological terms alone. Considerations analogous
to those presented earlier suggest that since U refers to all possible mental
states, a reformulation would have to assert this: For every mental-state
characterization expressible in a true psychological theory, there exists a
coextensive bodily-state characterization in some true physico-chemical-
biological theory. But while this reformulation comes close to the intent
of the ontological view it is to paraphrase, it clearly is problematic in es¬
sentially the same respects as the latter.

4. THE LINGUISTIC-ONTOLOGICAL DILEMMA FOR
UNIVERSAL DETERMINISM

It may be of interest to note, by way of a brief digression, that the
thesis of universal determinism is subject to a similar difficulty. To see
this, let us recall that a theory is called deterministic with respect to systems
of some particular kind if, for any such system, it specifies (a) a set of mag¬
nitudes called variables of state, whose values at a given time are said
fully to characterize the state of the system at that time, and (b) a set of
laws which, given the state of such a system at some time t0, mathematically
determine the state of that system at any other time. Newton’s theory of
gravitation and of motion is a deterministic theory with respect to any
isolated system of point masses exerting only gravitational forces upon
each other—provided that the notion of state for such a system is limited
to the positions and momenta of the constituent point masses. In astro¬
nomical applications of the theory, the point masses are represented by
astronomical bodies that are small by comparison with their distances; and
the theory has been used with great success to compute earlier or later
states on the basis of a given one. Astronomical objects may also change,
of course, in many respects other than position and momentum: in mass,
temperature, surface texture, internal structure, magnetic characteristics,
and so forth. On these, however, the Newtonian theory has no bearing,
for they cannot be expressed in terms of its variables of state; and New¬
ton’s theory is deterministic only with respect to its specified, quite narrow,
concept of state. Systems covered by a deterministic theory may then be
called deterministic systems with respect to the relevant notion of state.5
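
The two-part definition just given (variables of state, plus laws that fix the state at any other time) can be illustrated by a deliberately small sketch. It is not Newton's gravitational theory: it assumes a single point mass moving in one dimension under a constant force, and the names State and evolve, as well as the default values, are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Variables of state for the toy theory: position and momentum only."""
    position: float
    momentum: float


def evolve(state: State, dt: float, mass: float = 1.0, force: float = 0.0) -> State:
    """Law of the toy theory: the state after an interval dt.

    dt may be negative, so earlier states are computed in the same way
    as later ones. Assumes a constant force (an illustrative simplification).
    """
    new_momentum = state.momentum + force * dt
    new_position = (
        state.position
        + (state.momentum / mass) * dt
        + 0.5 * (force / mass) * dt ** 2
    )
    return State(new_position, new_momentum)


# Usage: go forward by 3 time units, then back by the same interval.
initial = State(position=0.0, momentum=2.0)
later = evolve(initial, 3.0, mass=2.0, force=4.0)
recovered = evolve(later, -3.0, mass=2.0, force=4.0)
```

Features that are not among the variables of state (temperature or surface texture, say) have no place in State, so the toy law says nothing about them; this mirrors the restriction to a specified concept of state noted above.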
Universal determinism may then be viewed as an ontological thesis to
the effect that the entire universe is deterministic in all respects, that it forms
a deterministic system with respect to a concept of state sufficiently com¬
prehensive to permit a full characterization of every empirical feature of
the universe at a given time. More explicitly, this would come to the
assertion that there is a true theory which is deterministic and whose concept of
state is so rich that all aspects of the universe at any given time—physico¬
chemical, biological, psychological, sociocultural, etc., and also aspects of
kinds that we may be entirely unaware of—can be fully specified, down
to the last detail, by a suitable assignment of numerical values to the
variables of state of that theory.
Under this boundlessly inclusive construal of the total state of the uni¬
verse, however, determinism turns into a trivial truth. For it is surely one
of the characteristics of the state of the universe at time t0 that at a certain
other time t1—i.e., at a specified interval before or after t0—it displays such
and such features. Accordingly, a complete specification of the state of the
universe at t0 would have to include a full characterization of the universe at
all times; and thus, the total state of the universe at one time would trivially
determine its total state at all other times—all those states being identical.
Any significant statement of universal determinism requires, therefore, a
suitable limitation of those features that are to count as constituting the
state of the universe at any given time.
Here, then, is another example of the contrast between a conception ex¬
plicitly concerned with systems and states characterizable by means of the
conceptual apparatus of a specified theory and a related, self-defeatingly
comprehensive ontological idea. And as in the cases of mechanism and the
thesis of universal psychophysical association, a linguistic turn of de¬
terminism, involving relativization with respect to some theory, however
rich, cannot preserve the full intuitive intent of its ontological conception.

5. REDUCTION OF THEORETICAL PRINCIPLES

Although the ontological theses of mechanism in biology and of universal
psychophysical association have proved to be obscure in important
respects, they are by no means totally unintelligible; in fact, they might be
regarded as overextensions of much more restricted and intelligible asser¬
tions concerning the reducibility of the concepts and principles of some
particular scientific theory to those of another. How the relevant notion
of reduction might be construed is suggested in part by the preceding dis¬
cussion. Let us consider a few of the details.
The reduction of the principles, or statements, of one theory, for ex¬
ample, a biological theory B, to those of another, such as some physico¬
chemical theory P, would consist in the derivation of the statements of B
from those of P. But a statement of B contains terms of the characteristic
vocabulary, VB, of B; whereas the statements of P do not. The derivation
of the former from the latter will therefore normally require additional
premises in the form of statements that contain terms from the vocabularies
of both B and P. It is possible, to be sure, to deduce from any sentence
containing only physicochemical terms various sentences which contain
biological terms essentially (i.e., which are not logically equivalent to
sentences not containing those biological terms). For example, the sentence

(S1) When sodium chloride crystals are put into water, they dissolve

logically implies

(S2) When sodium chloride crystals are put into water they dissolve or turn
into bacteria;

and the statement

(S3) Any body falling freely near the earth moves with constant acceleration

implies

(S4) Any living body falling freely near the earth moves with constant ac¬
celeration.

But these reductive derivations are not specifically biological, even though
each of the derived sentences S2 and S4 contains an essential occurrence
of a biological term. For if that term is replaced by any other term of the
same logical type—whether from the vocabulary of biology or from that
of some other discipline—the resulting sentence will equally follow from the
given physicochemical premise. Thus, if in S2 the word “bacteria” is
replaced by “fish,” by “nonbacteria,” or by “iron filings,” the resulting
sentence still follows from S1; and similarly, S3 also implies the sentences
obtainable from S4 by replacing “living” by “nonliving,” “iron,” “spherical,”
etc.
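
The point that these derivations do not depend on the biological term can be checked formally. The following Lean 4 sketch proves both implications for an arbitrary predicate B; the predicate names (Salt, Dissolves, Falls, ConstAcc) are illustrative assumptions standing in for the phrases of S1 through S4, not part of the original argument.

```lean
-- S1 implies S2, whatever predicate B is put in place of "turn into bacteria".
theorem s1_implies_s2 {α : Type} (Salt Dissolves B : α → Prop)
    (s1 : ∀ x, Salt x → Dissolves x) :
    ∀ x, Salt x → (Dissolves x ∨ B x) :=
  fun x hx => Or.inl (s1 x hx)

-- S3 implies S4, whatever predicate B is put in place of "living".
theorem s3_implies_s4 {α : Type} (Falls ConstAcc B : α → Prop)
    (s3 : ∀ x, Falls x → ConstAcc x) :
    ∀ x, (B x ∧ Falls x) → ConstAcc x :=
  fun x h => s3 x h.2
```

Since B is universally quantified in both theorems, neither proof uses any premise about B; this is the peculiarity that reductions specific to biology lack.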
It is the absence of this peculiarity which characterizes what might be
called reductions specific to biology. In these, the derivation of a biological
principle from physicochemical ones must make use of additional premises
which contain biological as well as physicochemical terms essentially, and
which thus establish connections between the two theories. Following
Ernest Nagel’s similar usage,6 I will refer to them as connecting principles.
For example, the chemical analysis of photosynthesis and cellular respira¬
tion in plants permits a reduction of some uniform features of these
phenomena to physicochemical laws. Such reduction makes use of con¬
necting principles broadly to the effect that the plant cells involved in
photosynthesis contain certain characteristic chemicals, including chlo¬
rophyll; that the latter is a mixture of two substances of such and such
molecular structures; and so forth. To this information, principles of physics
and chemistry can then be applied to account for certain physicochemical
processes that take place in the cells and for the production of energy by
oxidation, which is the principal aspect of cellular respiration. In its barest
logical essentials, the structure of the resulting reductive accounts may be
schematically characterized as follows: A law or theoretical principle
couched in terms of physicochemical theory

(x) (Q1x ⊃ Q2x)

is combined with connecting principles

(x) (R1x ⊃ Q1x)
(x) (Q2x ⊃ R2x)

to derive a biological law

(x) (R1x ⊃ R2x)

The reduction of a theoretical principle of B to principles of P thus does
not require a definitional or quasi-definitional reduction of the relevant
biological terms; the connecting principles need not provide full definitions
nor necessary and sufficient conditions of application for the biological
terms in physicochemical language; statements expressing necessary or
sufficient physicochemical conditions for biological concepts may well
suffice.
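
The schematic derivation above can be verified mechanically. In the following Lean 4 sketch, Q1 and Q2 stand for physicochemical predicates and R1 and R2 for biological ones, as in the schema; the hypothesis names law, c1, and c2 are labels chosen here for illustration.

```lean
-- From a physicochemical law and two one-way connecting principles,
-- derive the biological law (x)(R1x ⊃ R2x).
theorem reduction_schema {α : Type} (Q1 Q2 R1 R2 : α → Prop)
    (law : ∀ x, Q1 x → Q2 x)   -- physicochemical law
    (c1  : ∀ x, R1 x → Q1 x)   -- Q1 is a necessary condition for R1
    (c2  : ∀ x, Q2 x → R2 x) : -- Q2 is a sufficient condition for R2
    ∀ x, R1 x → R2 x :=
  fun x h => c2 x (law x (c1 x h))
```

The proof uses only conditionals, never a biconditional, which is the formal counterpart of the remark that necessary or sufficient conditions may suffice.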
The discovery of such principles is a matter of scientific research; and
the connecting principles that are presently available do not, of course,
even remotely suffice to reduce all the laws and theoretical principles of
current biological theory to those of current physicochemical theory.

6. REDUCTION OF THEORETICAL TERMS

Even a set of connective principles powerful enough to permit a reduction
of all the principles of a biological theory B to those of a physicochemical
theory P need not permit a corresponding reduction of concepts; i.e., it
need not provide a characterization of every concept of B by means of
concepts of P. And only if this, too, had been accomplished could the
biological theory be said to have been reduced to a chapter of the physico¬
chemical one. For short of this, the reduction of B to P would amount to
the establishment of a set of connecting principles which, by our earlier
criterion, would constitute additional biological laws and would thus simply
expand the theory B, but would not make its conceptual apparatus dispen¬
sable.
Full reduction of concepts in this strict sense would require, for every
term of B, a connective law of biconditional form, specifying a necessary
and sufficient condition for its applicability in terms of concepts of P alone.
Such a law could then be used to “define” the biological term and thus,
theoretically, to avoid it. The “definition” of the temperature of a gas as the
mean kinetic energy of its molecules illustrates this kind of connective
principle. Examples pertaining to biological concepts are provided by the
connective principles characterizing such substances as chlorophyll and
various hormones in terms of their molecular structures. These principles
do not, of course, express synonymies ascertained by meaning analysis,
but empirical connections ascertained by chemical analysis. In fact, the
biological terms and the corresponding chemical characterizations are not
even strictly coextensive if the former are taken to refer to substances found
in certain kinds of cell or produced by certain kinds of gland: the chemical
“definition” of the biological terms then involves a broadening of their
extensions, with the effect of covering also certain substances synthesized
in the laboratory.
As was noted in Section 2, the possibility of such reduction of biological
concepts is of crucial importance to mechanism; for the claim that all—or
even certain specified—biological characteristics “are” basically physico¬
chemical implies at least that they can be extensionally characterized in
physicochemical terms. But as in the case of the theoretical principles, the
reduction of all concepts of current biological theory (or of the terms
standing for them) to those of current physics and chemistry is not remotely
in sight. This fact does not, of course, lend support to such specific anti-
mechanistic conceptions of life as neo-vitalism, which fails to meet the
most basic requirements concerning the formulation and testability of
scientific theories, or holism and emergentism, which involve various logical
and methodological confusions—including an implicitly ontological
construal of reduction—as Ernest Nagel has convincingly shown.7
The reasoning outlined in this section and in the preceding one can be
applied analogously also to the questions concerning the reducibility of the
concepts and laws of psychological theories to the concepts and laws of
physicochemical and biological theories. Both kinds of reduction require
the establishment of appropriate connective principles, and whatever prin¬
ciples of that kind may be currently available are very far indeed from
providing a basis for a wholesale reduction.

7. REDUCTION AND SCIENTIFIC CHANGE

Interest in the concept of theoretical reduction has recently increased
as a result of the debate among philosophers and historians of science
about the logical aspects of the process of scientific change in which one
theory comes to be replaced by another. According to a familiar conception,
the old theory will often be linked to the new one by a reductive relation¬
ship. Thus classical thermodynamics is said to have been reduced to sta¬
tistical mechanics; geometrical optics to wave optics; Galileo’s and Kepler’s
laws to Newton’s theory of motion and of gravitation, and so forth.
But our earlier characterization of reduction does not strictly fit these
cases; for, as has been stressed by several writers, in none of these standard
examples does the supposedly reducing theory imply the principles of the
supposedly reduced theory; on the contrary, it contradicts them. Newton’s
theory implies that the acceleration of a body in free fall is not constant,
as asserted by Galileo’s law, but increases throughout the period of fall;
and that when several planets move about the sun, their orbits will show
perturbational deviations from an elliptic shape. Wave optics implies that
light does not always travel in straight lines; and the kinetic theory is incom¬
patible with the second law of thermodynamics in its strict classical form.8
Thus, the notion of deductive reducibility of laws does not fit the logical
relationship between such successive theories.
Indeed, several philosophers and historians of science have argued that
the idea of deductive reducibility reflects a fundamentally mistaken con¬
ception which views scientific change as a cumulative process in which each
new theory incorporates, and thus deductively implies, the knowledge
embodied in its predecessor; whereas in fact, according to this view, the
transition to a new theory involves the replacement of one conceptual
scheme and its characteristic set of theoretical principles by another one
that is incompatible and indeed incomparable with it, and whose concepts,
even if represented by terms from the vocabulary of the preceding theory,
have new and different meanings.9
This is not the place for a detailed examination of this conception of
scientific change; but I do wish to comment here on some of Feyerabend’s
ideas on the subject, which are directly relevant to our topic. Feyerabend
notes the logical incompatibility of many scientific theories with their
predecessors and then comments on the consequences of this fact for the
meanings of the key terms which are retained in the transition:

After all, the meaning of every term we use depends upon the theoretical con¬
text in which it occurs. Words . . . obtain their meanings by being part of a
theoretical system. Hence if we consider two contexts with basic principles that
either contradict each other or lead to inconsistent consequences in certain
domains, it is to be expected that some terms of the first context will not occur
in the second with exactly the same meaning.10

Feyerabend goes on to say that the conceptual systems of two mutually
inconsistent theories must be “mutually irreducible (their primitives cannot
be connected by bridge laws that are meaningful and factually correct).”11
But two theories using the same terms with different meanings may be
perfectly compatible even if their formulations are contradictories of each
other.12 For example, the two formally contradictory expressions
‘(x) (Fx ⊃ Gx)’ and ‘(∃x) (Fx · ~Gx)’ need not be incompatible if the
meanings of ‘F’ and ‘G’ in the first formula differ from those in the second.
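
A small finite model may make this concrete. The sketch below evaluates the two formulas over the numbers 1 to 6; the domain and the two interpretations of ‘F’ and ‘G’ are hypothetical choices made purely for illustration. Under any single interpretation the formulas have opposite truth values, yet each can be true when the letters are read differently in each.

```python
def universal(domain, F, G):
    """Truth value of (x)(Fx ⊃ Gx) over a finite domain."""
    return all((not F(x)) or G(x) for x in domain)


def existential(domain, F, G):
    """Truth value of (∃x)(Fx · ~Gx) over a finite domain."""
    return any(F(x) and not G(x) for x in domain)


DOMAIN = range(1, 7)

# Hypothetical reading of 'F' and 'G' in the first formula.
F1 = lambda x: x % 4 == 0   # "is a multiple of 4"
G1 = lambda x: x % 2 == 0   # "is even"

# Hypothetical, different reading of 'F' and 'G' in the second formula.
F2 = lambda x: x % 2 == 1   # "is odd"
G2 = lambda x: x % 3 == 0   # "is a multiple of 3"

# Both formulas come out true when each is read under its own interpretation.
both_true_across_readings = universal(DOMAIN, F1, G1) and existential(DOMAIN, F2, G2)

# Under one and the same reading, the two formulas never agree in truth value.
contradictory_under_one_reading = all(
    universal(DOMAIN, F, G) != existential(DOMAIN, F, G)
    for F, G in [(F1, G1), (F2, G2)]
)
```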
Feyerabend’s puzzling thesis that successive theories are incompatible
and that their key terms differ in meaning may have its root in an overly
narrow construal of the idea that terms “obtain their meanings by being
part of a theoretical system.” If one construes this idea as implying that the
meanings of the key terms of a theory must be such as to make the theory
true, then the terms are, as it were, “implicitly defined” by the theoretical
principles, and the latter are true simply in virtue of the meanings thus
assigned to the terms—they are analytic. And a formal incompatibility
between two theories with the same key terms must then be taken to indicate
simply that those terms have different meanings in the two systems.
But while this consideration may lend plausibility to Feyerabend’s con¬
tention, its initial assumption hardly fits the character of scientific theorizing.
For if the principles asserted by a scientific theory are implicit definitions of
its key terms and hence analytic, the role of experiment and observation
and the need for empirical evidence are thrown into question. If—to con¬
struct a schematic example—Galileo’s law and Kepler’s laws were taken
to be definitive of “free fall” and of “planetary motion,” then there would
be no need for experimental or observational test. Moreover, empirical
data on the actual fall of bodies near the earth or on the actual motion
of planets about the sun would be irrelevant to those laws. If the findings
did not conform to the law, this would show only that actual fall is not
free fall as implicitly defined by Galileo’s law or that the actual motion
of the planets is not planetary motion as implicitly defined by Kepler’s
laws. The laws would be analytic; in order to make them applicable to their
usual empirical subject matter and thus to restore the relevance of empirical
testing, they would have to be supplemented by laws to this effect: the
fall of a body in a vacuum near the surface of the earth is free fall as
characterized by Galileo’s formula; the motion of the planets about the
sun is planetary motion as characterized by Kepler’s laws. But the “theories”
obtained by such supplementation clearly are no longer analytic; their
terms are not implicitly defined by them. It does not follow, therefore, that
the transition to a new theory using some or all of the key terms of its
predecessor must involve a total change in the meanings of those terms,
including even those that serve to describe observational findings.13 Indeed,
it is quite unclear how the two theories could be held to present different
and even conflicting conceptions of some class of empirical phenomena
if the sameness of the subject matter and the details of the theoretical con¬
flict could not be characterized, at least in part, by means of terms whose
meanings remain constant in the transition.
But if we grant that, in the development of science, a new theory will
normally be incompatible with the one it replaces, and if on the other hand
we reject the view that the two theories in question are conceptually “incom¬
mensurable” in the sense that “the meanings of their main descriptive terms
depend on mutually inconsistent principles”14—then how are we to construe
the logical relationship between such theories, and how is the concept of
reduction to be modified to fit situations of this kind?
Kemeny and Oppenheim have proposed a construal of reduction15 which,
with slight stretching, might accommodate cases of this sort, though it was
avowedly not intended for that purpose.16 According to this construal,
reducibility is not a relation just between two theories but a relation in
which one theory, T2, may stand to another, T1, with respect to some set
of observational data O. Briefly, T2 is said to be reduced to T1 relative to O
if (a) T1 explains any part of O that is explained by T2, (b) T1 is at least
as well systematized as T2 (this notion is made clear only in an intuitive
fashion), and (c) the theoretical vocabulary of T2 contains terms that are
not contained in the theoretical vocabulary of T1. If a more or less close, but
not exact, accord of the data with the theoretical principles is taken to
suffice for purposes of explanation, then this characterization would seem
to fit the relationship of successive theories which, strictly speaking, are
incompatible, provided that O is taken to be a set of data that T2 has in
fact served to explain. Thus, the Newtonian theory is much more highly
systematized (in an intuitively plausible sense) than the combination of
Galileo’s and Kepler’s laws; its theoretical vocabulary does not contain
the terms “free fall” and “the sun,” for example; and it accounts quantita¬
tively for the empirical findings that bear out, and are explained by, the
more limited set of laws.
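
The three conditions (a), (b), and (c) can be written out as a simple predicate. The sketch below is only a rough rendering under strong assumptions: the degree of systematization, which the construal leaves intuitive, is replaced by an invented number, and the vocabularies, data items, and scores assigned to the two example theories are hypothetical placeholders rather than historical claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theory:
    name: str
    vocabulary: frozenset       # theoretical terms of the theory
    explains: frozenset         # observational data the theory explains
    systematization: float      # hypothetical numeric stand-in for condition (b)


def reduced_to(t2: Theory, t1: Theory, data: frozenset) -> bool:
    """True if t2 counts as reduced to t1 relative to data, per (a), (b), (c)."""
    a = (data & t2.explains) <= t1.explains           # (a) t1 explains what t2 explains in data
    b = t1.systematization >= t2.systematization      # (b) t1 is at least as well systematized
    c = bool(t2.vocabulary - t1.vocabulary)           # (c) t2 has terms that t1 lacks
    return a and b and c


# Illustrative placeholders only.
O = frozenset({"fall of stones", "planetary positions"})
galileo_kepler = Theory(
    "Galileo and Kepler",
    frozenset({"free fall", "the sun"}),
    O,
    1.0,
)
newton = Theory(
    "Newton",
    frozenset({"mass", "force"}),
    O | frozenset({"tides"}),
    2.0,
)

forward = reduced_to(galileo_kepler, newton, O)
backward = reduced_to(newton, galileo_kepler, O)
```

With these placeholder values the relation holds from the older laws to the Newtonian theory but not in the opposite direction, because condition (b) fails for the converse.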
But this construal of reduction also has some awkward features, which
throw its adequacy into doubt. Suppose that two theories are equally well
systematized, that they have no theoretical terms (and hence no theoretical
principles) in common, and that each of them explains a certain set of data,
O. Then either of the theories would count as reduced to the other
(relative to O), even though they do not share a single theoretical term or
principle. Reducibility of one theory to another requires more than that the
latter account for a set of data that the former can explain: The point
of the reduced theory is, after all, to offer general principles of which the
observed cases are just particular instances; and the reducing theory will
somehow have to account for these principles. In fact, the arguments that
are actually used in particular instances to show that a given theory is
reducible to another normally make little if any specific mention of the data
explained by the reduced theory; they rather aim to show that the general
laws asserted by the old theory are—within a limited domain, to which
its supporting data were restricted—approximations of what the new theory
implies for that domain. For example, the Newtonian theory implies that
within small altitudes above the surface of the earth, free fall will have
nearly constant acceleration, and that under specific conditions that are in
fact met by the planets of our solar system, the planets will move in very
close accordance with Kepler’s laws.
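Both approximations can be made explicit. The following is a sketch under standard textbook assumptions that the text does not spell out: a spherically symmetric earth of mass M and radius R, a body at altitude h, and a planet of mass m moving about a sun of mass M_s on an orbit with semimajor axis a and period P; G is the gravitational constant.

```latex
% Free fall near the earth's surface: the acceleration depends on the
% altitude h, but only weakly when h is small compared with R.
\[
g(h) = \frac{GM}{(R+h)^2}
     = g_0\left(1 + \frac{h}{R}\right)^{-2}
     \approx g_0\left(1 - \frac{2h}{R}\right),
\qquad g_0 = \frac{GM}{R^2}, \quad h \ll R .
\]
% Planetary motion: the two-body form of Kepler's third law.
\[
\frac{P^2}{a^3} = \frac{4\pi^2}{G\,(M_s + m)}
\approx \frac{4\pi^2}{G\,M_s}
\qquad (m \ll M_s).
\]
```

Constant acceleration holds exactly only in the limit where h/R vanishes, and the constancy of P²/a³ for all planets only where m/M_s vanishes and the mutual attraction of the planets is neglected; within the domain of the original data the deviations are very small.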
Several writers have therefore suggested a liberalized construal of the
reductive relationship between theories, or between a theory and a set of
previously established empirical laws that the theory is said to explain.
Under this construal, the new or reducing theory would be required to
imply that, within a certain domain, close approximations of the principles
of the old theory hold good. This view has recently been urged by Smart
and by Putnam17 in a discussion of Feyerabend’s position. Feyerabend’s
reply is that in this kind of reasoning, “real theories, theories which have
been discussed in the scientific literature, are replaced by emasculated
caricatures”18—namely, by the approximations of them that are derivable from
the new theory.
But this reply, it seems to me, is a verbal quibble. I think that both sides
in the dispute have contributed important insights, as the following analogy
—which in some respects is more than an analogy—might serve to show.
Consider a group of investigators who live in a small area on the surface
of a large sphere and who try to ascertain the geometrical structure of that
surface by suitable measurements. Initially, when their limited technology
restricts their movements to relatively short distances and permits
measurements of only moderate precision, they obtain a set of geometrical findings
which are very satisfactorily explained by the theory E, that their surface is
a Euclidean plane. Later, with an advanced technology that increases the
region accessible to them and improves the accuracy of their measurements,
the investigators obtain results that deviate more and more markedly from
what E would lead them to expect; and a new theory S gains acceptance,
which declares the surface to be spherical.
This new theory strictly contradicts its predecessor. But in a perfectly
reasonable sense, we may say that S does not show E to have been totally
mistaken; for S implies that within regions as small as that in which the
original measurements were made, the propositions of E hold in very close
approximation: the angle sum of a triangle equals almost exactly two
right angles; the ratio of the circumference to the diameter of a circle is
almost exactly equal to π; two “straight lines” which intersect a third one at
equal angles do not intersect each other, and indeed their distance is
practically constant; and so forth. In this sense, the new theory incorporates the
old one as a close approximation applicable in a limited domain. To be sure,
the new theory shows the old one in a new light; and as a rule, in the
development of science, it is only in the light of the new theory that the
nature and the extent of the limitations of the old one become identifiable
and explainable. But within the limits thus indicated, the old theory may
well continue to be used (as is illustrated by the cases of geometrical optics
versus wave optics, classical mechanics versus relativistic mechanics, etc.).
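The approximations mentioned above can be stated quantitatively. The following sketch uses standard results of spherical geometry; the symbols are introduced here for illustration and do not occur in the text: R is the radius of the sphere, A the area of a triangle with angles α, β, γ, and C the circumference of a circle whose radius, measured along the surface, is r.

```latex
% Angle sum of a spherical triangle, and ratio of circumference to
% diameter (2r, measured along the surface) for a circle on the sphere.
\[
\alpha + \beta + \gamma = \pi + \frac{A}{R^2},
\qquad
\frac{C}{2r} = \pi\,\frac{\sin(r/R)}{r/R}
             = \pi\left(1 - \frac{r^2}{6R^2} + \cdots\right) < \pi .
\]
```

As A/R² and r/R tend to zero, the angle sum tends to two right angles and the ratio tends to π, so that measurements of moderate precision confined to a small region cannot discriminate between E and S.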
It is certainly important to note that the new theory contradicts the old
one, and that the resulting change in science is not cumulative in the strict
sense of preserving the content of the old theory and simply adding on to
it. But neither is the change entirely discontinuous; as the preceding
considerations show, it does not, in general, amount to a total rejection of
the earlier ideas and their replacement by new and “incommensurable”
ones. Indeed, the two theories are concerned with the same subject matter
in a sense that is intuitively clear and that can be made more explicit by
examining the relations between the key concepts of the theories E and S.
The vocabularies of these theories overlap to a large extent; both of
them contain the terms “point,” “line,” “incident on,” “intersects,” “angle,”
“triangle,” “circle,” and many others. Do these terms have the same
meanings in the two systems?
The standards of a very narrow “operationism” might suggest an
affirmative answer on the ground that the basic operational criteria of
application for each of these terms are the same in both cases: in either
theory, a straight-line segment would be operationally represented by a
straightedge or by a taut string; points might be interpreted as small marks
on the surface under study or as scratches on straightedges or as knots in
strings, congruence of line segments by the coincidence of boundary
markers. Also, the basic measurement of distances and of angles would
follow the same procedure in both cases, and so would the construction
of triangles or circles. There is good reason, therefore, for denying that
the two theories are incomparable, and for maintaining that they concern
the geometrical structure of the same surface, and that the concepts in
terms of which they characterize that structure agree to a considerable
extent in their empirical reference as characterized by their operational
criteria of application. Indeed, otherwise it is not clear how they could
be held to contradict each other. The contradiction manifests itself, for
example, in the fact that according to S, the ratio of the circumference of
a circle to its diameter is less than π, whereas according to E, it equals
π; and these conflicting claims lend themselves to test by means of the
common operational procedures specified for the relevant measurements in
both theories.
But the agreement in basic empirical interpretation assuredly cannot
be taken to show that the meanings of the terms in question remain entirely
unchanged in the transition from E to S; in a plausible sense, they are
indeed affected by the difference of the theoretical principles in which they
function. This must be admitted even from a more liberal operationist
point of view; for the principles of a scientific theory normally give rise to
what might be called derivative operational criteria of application for
some of its terms. Thus, the principles of E imply that two lines which
intersect a third one at right angles are parallel; and this consequence
may be used as one operational criterion for parallelism. But this criterion
is not transferable to the theory S, whose basic principles do not yield
the requisite implication; indeed, there are no parallels at all in spherical
geometry.
There remains the question whether, apart from the matter of derivative
operational criteria, the meanings of scientific terms are determined by
the theoretical principles in which they occur.19 Judging the issue intuitively,
it seems plausible to say that to some extent this is indeed the case. Thus,
the principle that mass is additive—that the mass of a physical system equals
the sum of the masses of its components—might be regarded as expressing
part of what the term “mass” means in classical mechanics; and the same
view might be taken of the principle of conservation of mass. Similarly,
the law that all electric charges are integral multiples of the charge of the
electron might be regarded as expressing part of what is meant by the term
“electron.”
But, plausible though they are, these construals may well be questioned.
Suppose, for example, that the existence of particles with charges smaller
than that of the electron were to be established. It is quite likely that the
term “electron” would then continue to be used, but without the claim
that the electronic charge is the smallest possible one. In this event, would
we have to say that there had been a change in the meaning of the word
“electron,” that we were no longer speaking of the same kind of thing as
before—or could the change be described instead as one in our empirical
beliefs about electrons, as a change showing that we had learned something
new about electrons? I think the process could be described in either way;
neither of them is unequivocally correct nor even decisively more
illuminating than the other.20
The heated debate over the question whether theoretical terms change
their meanings as a result of changes in theory has largely been carried on
without benefit of a clearly delineated notion of meaning by reference
to which claims and counterclaims might be adjudged. What would be
needed is a concept, and an associated criterion, that would determine a
division of all sentences asserted by science at any given time into two
classes: those which reflect the meanings of their constituent terms and
which are true solely by virtue of those meanings; and those expressing
empirical claims. As for this desideratum, however, I share the doubts
expressed by Quine21 and others concerning the adequacy of the various
philosophical attempts at explicating the analytic-synthetic distinction—
and even concerning the significance of that distinction. What is being
claimed for a statement by saying that it holds true by virtue of the meanings
of its constitutive terms? One familiar conception of analyticity would
suggest that the sentence is then regarded as inaccessible to refutation by
empirical evidence. But, as we noted, the statement that electrons carry the
smallest possible charges, even if viewed as reflecting part of the meaning
of “electron,” may well be given up in the face of certain new empirical
findings. The reply that the change in this case is one in meaning rather
than in beliefs concerning empirical facts is hardly very helpful for an
attempt to clarify the distinction between truth in virtue of meanings and
truth in virtue of facts.
In any particular scientific investigation of a given subject matter—such
as electrons and their charges—it will surely be necessary to agree on some
identifying characteristics of the subject matter and of those of its features
that are to be examined; and the statements specifying those characteristics
may then be said to indicate the meanings of the relevant terms. But the
class of statements serving this purpose for a given term or set of terms may
change, to some extent, from one investigation to another. Statements
might perhaps be said, with Quine, to be more or less central to the meanings
of the terms they contain, greater centrality indicating greater reluctance
to abandon the statement for the purpose of accommodating new evidence,
and also a stronger inclination to view such abandonment as effecting a
conceptual change. For the term “electron,” for example, each of the
following statements would seem to have greater centrality than the one
following it: electrons are subatomic particles; electrons carry uniform
negative charges; any electric charge is an integral multiple of the charge
of the electron; the charge of an electron equals 4.802 × 10⁻¹⁰ electrostatic
units. Laws or theoretical principles that may plausibly be construed as
relevant to the logical form of a term would be very central to its meaning;
thus, the transition from classical mechanics to the special theory of
relativity is often said to have effected a change in the concept of mass,
which in the classical theory represents an intrinsic characteristic of a
physical body, in relativity theory a relational feature.22
Our discussion of conceptual change had as its point of departure a
comparison of the two theories E and S in our example. The relations
we noted between these theories seem to me to obtain also between a set
of empirical laws (such as Galileo’s and Kepler’s) and a theory that is
said to explain them (e.g., Newton’s theory of gravitation and of motion);
and similarly between two theories of which the first is said to be reducible
to the second (as in the case of classical mechanics and the mechanics
of special relativity, or geometrical optics and wave optics, or classical
thermodynamics and statistical mechanics, or Bohr’s theory of atomic
structure and Sommerfeld’s refinement of it).
But the analogy to the relation between E and S does not extend, it
seems, to all pairs of scientific theories one of which supersedes the other.
The phlogiston theory and the oxygenation theory of combustion are
not related in such a way that approximations of the principles of the
former are derivable from the latter; according to the oxygenation theory,
phlogiston does not participate sometimes or to some degree in chemical
reactions: there just is no such thing as phlogiston, and combustion is
a process quite different from dephlogistication. Cases of this kind seem
to accord better with the conception, proposed by Kemeny and Oppenheim,
that one theory is reduced to another if the second accounts in a more
systematic and economical fashion for the empirical findings explainable
by the first.
In sum, then, the construal of theoretical reduction as a strictly deductive
relation between the principles of two theories, based on general laws that
connect the theoretical terms, is indeed an untenable oversimplification
which has no strict application in science and which, moreover, conceals
some highly important aspects of the relationship to be analyzed. But the
characterization of that relationship in terms of incompatibility and
incommensurability overemphasizes certain significant differences and neglects
those affinities by virtue of which the reducing theory may be said to offer
a more adequate account of the subject matter with which the reduced
theory is concerned.

NOTES
1. The writing of this essay was supported in part by a research grant from the
National Science Foundation.
2. R. Carnap, The Logical Syntax of Language; New York: Harcourt, Brace and
Co., 1937, Part V. A less technical statement of the basic idea is given in Carnap,
Philosophy and Logical Syntax; London: Kegan Paul, Trench, Trubner and Co.,
1935.
3. See, for example, pp. 249-250 of R. Carnap, Introduction to Semantics;
Cambridge, Mass.: Harvard University Press, 1942.
4. J. A. Shaffer, “Recent Work on the Mind-Body Problem,” American
Philosophical Quarterly 2, 81-104 (1965); p. 93. (The identity theory is not, however,
the only one that would account for the empirical association: A dualistic parallelism
would be another possibility; but this issue will not concern us here.)
5. For a thorough discussion of the concepts of deterministic theory and
deterministic system, see Nagel, The Structure of Science; New York: Harcourt, Brace
and World, 1961; Chapter 10.
6. Nagel refers to the requirements that there be such principles as the “condition
of connectability” (see op. cit., p. 354, p. 433). In accordance with the convention
introduced in Section 2, such connecting principles would count as additional
biological laws.
7. See Nagel, op. cit., Chapter 11, Parts IV and V, and Chapter 12. On pp. 364-366,
Nagel comments critically on what is, in effect, an implicitly ontological
construal of reduction.
8. For a fuller discussion of these and other examples, see P. K. Feyerabend,
“Problems of Empiricism,” in R. G. Colodny (ed.), Beyond the Edge of Certainty,
Englewood Cliffs, N. J.: Prentice-Hall, Inc., 1965, pp. 145-260; especially pp. 166-177.
9. This brief summary, and the additional remarks in the following discussion,
cannot, of course, do justice to the many facets of this challenging thesis, nor do
they differentiate between the views and the supporting arguments of different
proponents of this idea. Fuller accounts by leading advocates of this view will be found,
for example, in the following publications: P. K. Feyerabend, “Explanation,
Reduction, and Empiricism,” in H. Feigl and G. Maxwell (eds.) Minnesota Studies in the
Philosophy of Science, Volume III, Minneapolis: University of Minnesota Press,
1962, pp. 28-97; “Problems of Empiricism” (op. cit.); “Reply to Criticism,” in R. S.
Cohen and M. W. Wartofsky (eds.), Boston Studies in the Philosophy of Science,
Volume II: Humanities Press, 1965, pp. 223-261. N. R. Hanson, Patterns of
Discovery, Cambridge: Cambridge University Press, 1958. T. S. Kuhn, The Structure
of Scientific Revolutions, Chicago: University of Chicago Press, 1962. The central
ideas of these writers, and especially of Feyerabend, on scientific change are lucidly
surveyed and critically examined in D. Shapere, “Meaning and Scientific Change,”
in R. G. Colodny (ed.), Mind and Cosmos, Pittsburgh: University of Pittsburgh
Press, 1966, pp. 41-85.
10. Feyerabend, “Problems of Empiricism,” p. 180.
11. Ibid. (italics quoted).
12. This has been emphasized also by Shapere (see, for example, op. cit., p. 57).
Feyerabend has tried to meet this objection (see his “Reply to Criticism,”
pp. 231-233), but I have been unable to understand the possibilities he suggests of escaping
the difficulty.
13. Feyerabend deals with the meanings of observational terms and observational
reports in “Problems of Empiricism,” pp. 180-181, 202-218.
14. Feyerabend, “Problems of Empiricism,” p. 227.
15. J. G. Kemeny and P. Oppenheim, “On Reduction,” Philosophical Studies VII,
6-19 (1956).
16. loc. cit., p. 13.
17. J. J. C. Smart, “Conflicting Views about Explanation,” in R. S. Cohen and
M. W. Wartofsky (eds.), Boston Studies in the Philosophy of Science, Volume II:
Humanities Press, 1965, pp. 157-169; see pp. 160-161.—H. Putnam, “How Not to
Talk about Meaning,” in the same volume, pp. 205-222; see p. 207.
18. Feyerabend, “Reply to Criticism,” p. 229 (italics quoted).
19. For an illuminating critical discussion of this issue, supplemented by apt
examples, see also P. Achinstein, “On the Meaning of Scientific Terms,” The Journal
of Philosophy 61, pp. 497-510 (1964).
20. I find myself in essential agreement on this point with Achinstein’s remark:
“The fact that a ‘changed in meaning’ label may be unwarranted in a given case
does not necessarily imply that an ‘unchanged in meaning’ label is better. Both can
be misleading . . .” (op. cit., p. 504) and with Shapere’s conclusion: “Both the
thesis of theory-dependence of meanings . . . and its opponent, the condition of
meaning invariance, rest on the same kind of mistake (or excess). This does not
mean that there is not considerable truth ... in both theses.” (op. cit., p. 70.)
21. As set forth, for example, in “Two Dogmas of Empiricism,” reprinted in
W. V. O. Quine, From a Logical Point of View; Cambridge, Mass.: Harvard
University Press, 2d ed. 1961; and “Carnap and Logical Truth” in P. A. Schilpp (ed.)
The Philosophy of Rudolf Carnap; La Salle, Illinois: Open Court, 1963; pp. 385-406.
Carnap’s reply on pp. 915-922 of the same volume is of great interest for the
problem here under discussion; and so is H. Putnam’s essay, “The Analytic and the
Synthetic,” in H. Feigl and G. Maxwell (eds.), Minnesota Studies in the Philosophy
of Science, Vol. III; Minneapolis: University of Minnesota Press, 1962, pp. 358-397.
22. On this point, see the fuller discussions by Nagel, in The Structure of Science,
p. 111 and by Feyerabend, in “Problems of Empiricism,” pp. 168-169.
THE REALIST-INSTRUMENTALIST
CONTROVERSY1
Sidney Morgenbesser

Recently many philosophers and scientists have argued that the
controversy between the realist and the instrumentalist, a controversy which
according to some antedates even the rise of modern science,2 has been
settled primarily but not exclusively by developments within the sciences in
favor of realism. And though Carnap, Friedman,3 and other distinguished
scientists and philosophers demur, the received opinion seems to be that
instrumentalism has ceased to be a viable option in the philosophy of
science. What then may we ask is this controversy about? Are realism and
instrumentalism, as Popper and others seem to claim4, the two leading
contrary approaches in the philosophy of science, at least one of which
deserves the allegiance of the committed philosopher? In what follows I
shall argue that realism and instrumentalism considered especially as
approaches to theories and their justification do not always compete and
that neither is the obvious victor when they do, thereby offering a negative
answer to the second question and rebutting the first. Note that the phrase
“approaches to the theories and their justification” is not merely decorative.
It is a truism that any two approaches in philosophy have some areas of
agreement. It ought to be an equal truism that realism and instrumentalism
—when the term “realism” is applied to the leading doctrines in the
writings of Popper, Feyerabend and other typical realists and “instrumentalism”
to key doctrines in the writings of paradigm instrumentalists from Peirce
to Lewis—agree on many important methodological issues. Realism has
emphasized its opposition to inductivism or the views that there are in Mill’s
sense rules for discovery of hypotheses, that scientific inquiry which aims
primarily to increase our knowledge goes through stages, the first of which
can be considered purely inductive and non-theory-dependent, and has
emphasized the importance of the distinction between the degree of worth and the
degree of acceptability of a scientific hypothesis or theory. Similar ideas
were sponsored by many instrumentalists or if one wishes, pragmatists.
In the distant thirties Reichenbach5 wrote a warm appreciative essay
on the philosophy of John Dewey in which he claimed that the evidence
or arguments in favor of the reality of theoretical entities favored the
realists and condemned the instrumentalists who viewed theoretical
entities as useful tools or fictions. Dewey’s answer was not clear, for he seemed
to vacillate between denying the thesis that theoretical entities are real and
affirming the one that they are relational or functional ones. But though his
answer is not too perspicacious it does, I think, indicate the importance of
distinguishing for the purposes at hand between the thesis that theoretical
entities are real and the thesis that they exist. Alternatively Dewey’s answer
suggests that we leave open to the instrumentalists the option of denying the
reality and affirming the existence of theoretical entities. Of course the
distinction is a metaphysical one for it bears upon the attempt by Dewey and
other instrumentalists to deny the traditional metaphysical thesis which, to
use a once popular phrase, bifurcated nature and insisted that while
theoretical entities and phenomenal ones both exist, theoretical ones alone
are real. Dewey’s position6 could be interpreted therefore as the simple
one of affirming the existence of theoretical entities and attributing to
them no special ontological status, a position which is adopted by most if
not all methodological realists. On this account there is no
incompatibility between the instrumentalists’ insistence upon the nonreality of
theoretical entities and the realist position as thus far introduced.
Of course often enough some instrumentalists did deny or appear to
deny the existence, not only the super-reality, of theoretical entities and
did claim or appear to claim that such entities are merely useful tools. But
the instrumentalists’ position as a whole suggests that we interpret these
claims to be directed against Machian types7 of programs which would not
even view scientific theories which countenanced theoretical or
unobservable entities as admissible ones. Put positively, we can view the
instrumentalists as favoring the thesis that scientific theories are useful and that
scientists are justified in using them even if the entities they countenance
are fictional. Given this interpretation the instrumentalist is disengaged
from the silly if not self-contradictory thesis that nonexistent entities can
be used and is credited with a thesis that is not at variance with the one
with which we began as the realist one. Of course since the instrumentalists
lately considered and the realists differ on the existence of theoretical
entities, some difference in their position remains. But if that difference is the
sole one, then instrumentalism and realism become not the exciting
contraries we have been promised but merely dull contradictories.
A critic may chide us for failing to tell a more complete story which, among
other things, would remind us that the instrumentalists did defend not only
the banal thesis that theories even if false may be useful, but the contested
ones that (a) theories are not true or false and that (b) theories have a
justified use only if they serve as guides for action undertaken to satisfy our
noncognitive needs. With the fuller story, the critic continues, a nontrivial
controversy between the instrumentalist and the realist is in the offing,
and it materializes if we attribute to the realist the views that scientific
theories are true or false and that science aims at explanation in addition
to the ones already noted, namely that theoretical entities are real and that
science is distinguished by its employment of the hypothetical deductive
method. Note however that Lewis, who did opt for a sophisticated version of
(b), avoided (a) and that Peirce accepted not (b) but the thesis that
scientific theories serve as guides for action undertaken to solve specific scientific
problems, a near corollary of the familiar pragmatic thesis that scientific
terms are only understandable if we know the uses to which sentences in
which they appear are put by scientists. Observe further that pragmatists
often replaced “truth” with “workability” or “true” with “useful” when
they thought it essential to explicate the phrase, “believes in T,” taken as
a unit, where “T” stands in place of a description of a metaphysical
principle. That is to say, pragmatists like Peirce often denied that scientific
theories presupposed metaphysical laws or principles (for example, the
law of causality), but did affirm that it may be true of scientists that they
believed in certain metaphysical laws or principles (once again, the law of
causality). Committed as they were to a dispositional analysis of belief, they
ended by attributing to the scientists certain habits and propensities to act
which could only be good or useful, not true or false. Needless to say, such
an analysis by pragmatists of a reconstruction of metaphysical principles
as expressions of habits need not entail any thesis about bona fide
candidates for empirical laws. The difficulty then with the critic’s story is double.
It suggests falsely that instrumentalists spoke with one voice on the issues
it covers; it fails to note the multilayered aspects of the pragmatic position.
What then is the controversy between the instrumentalist and the realist?
I suggest that we have been misled by the question and have been
unsuccessfully hunting for the instrumentalist, and draw the corollary that
we change tactics and look for more than one, at least two or, if one wishes,
two and a half. And once we change our perspective we can use some of
the previously noted diverse standards to distinguish between the
noncognitivist and the contextualist instrumentalist. Luckily there is no need to
multiply ideal type entities beyond necessity, and we can rest with one
realist whom I shall tag the methodological realist and whose position is easy
enough to predict. He will be identified as a defender of the conjunction of
the four already encountered theses: (1) that theoretical entities are real,
(2) that scientific theories or laws are true or false, (3) that science
aims at explanation, and (4) that science is committed to the hypothetical
deductive method. The easiest way of characterizing the noncognitivist
is to have him simply deny 1, then 2, and then 3. But since negative
philosophy alone won’t do, we note that the weak noncognitivist insists that
laws and theories are best construed as rules of inference and that science
aims at explanation and prediction, in contrast to the strong noncognitivist
who refuses to attribute sense to scientific theories which countenance
theoretical entities and insists that scientific theories have predictive use
only. As can be gathered from his name, the contextualist is cagey and is
for the most part agnostic about the issues we have been considering. But
he is not entirely silent: he reminds us that scientific terms can only be
understood by reference to the uses to which the sentences in which they appear
are put by scientists, and he warns against the literal interpretation of scientific
theories. He adds that scientific theories are justified if they solve the
problems they were designed to solve and concludes that scientific inquiry sets its
own aims. It is apparent that we have only given curtain raisers, and that
more detailed description of these positions is necessary. But it may
not be apparent why we have gathered the two last positions together as
instances of instrumentalism. The reason is, I think, primarily historical.
Both types have been defended by instrumentalists, who are disposed to
take a naturalistic approach to language,8 to view such linguistic entities as
scientific theories as tools constructed by men to cope with their
environment, and to examine the various interconnections between knowledge and
action. But since the noncognitivist tended to emphasize and the contextualist
to de-emphasize the differences between linguistic and nonlinguistic tools, we
end with varieties of instrumentalism.
The various theses defended by the methodological realists are not
arbitrarily yoked together. The link between R2 and R3 is the nomological model
of a satisfactory explanation, between R1 and R3 the history of science
which proves to the realist that science cannot achieve its aim without
constructing and confirming theories which postulate theoretical entities and
which can only be indirectly tested. Hence R4. But though methodological
realism is not a mere heap, it is not an organic whole whose theses resist
transplantation on instrumentalist’s soil. I suggest that some of its theses
may be combined with and some helpfully modified in the light of
instrumentalistic ones, perhaps to the benefit of all, and especially to the benefit
of our quest for concord rather than discord between instrumentalism and
realism. In considering these issues I shall assume that we can avoid
defining “theoretical entity” or even “theoretical term” but can begin by taking
it for granted that “electrons” and “protons”—“perfectly competitive
industries”—are theoretical terms, and that “table” is not. Two notes may be
added. (a) I shall assume that it is sufficient for S to be a theoretical
entity if S cannot because of nomic consideration be directly perceived and
hence that theoretical entities are inferred ones. But of course the converse
does not follow; I shall not say that inferred entities need be nonobservable.
(b) In most cases when we say that S is a theoretical entity we do mean
to say that S exists and is theoretical, but obviously not in all cases. When
necessary I shall rhetorically refer to the latter kind, among which would be
perfect gases, as ideal entities.9
It is obvious that R1 does not entail R2, R2 being about linguistic entities
and R1 not. Hence there is a simple way in which one can claim the
compatibility of R1 with I2, the thesis that theories are not true or false.
But this way of severing the connection between R1 and R2 is a trivial
one and open to the objection that R1 and R2 are pragmatically related in
the familiar sense that anyone who asserted R1 and denied R2 would be
engaged in linguistically odd practice. But on what grounds? It might be
claimed that anyone who accepts R1 accepts a theory, and hence rejects
the instrumentalist’s thesis that no theory is true or false. But though this
claim is reasonable, it is not entirely persuasive, since it rests on the debatable
assumption that the terms “theory” and “hypothesis” are always
interchangeable. But for the purposes at hand it seems reasonable to rule
that S is a hypothesis if S is not entailed by the total evidence that supports
it and that S is not a theory unless it is lawlike, or at least lawlike appearing.
Given this ruling, though theories may be and most would be hypotheses,
not all hypotheses are theories—a conclusion which removes the force of
the objection. But only momentarily. We will be reminded that anyone
who accepts R1 does so because he accepts as true such statements as
“there are electrons,” which are believed only because scientific laws and
theories are accepted. Once again, I2 seems to be pragmatically incompatible
with R1. True enough. But this objection merely shows that the
acceptance of R1 entails the acceptance not of R2 but of its modified version,
which reads that some laws and theories are true or false and that
others need not be, a thesis which can also be viewed as a modified instrumentalist
one. Note in further support of this conclusion that no one argues
that the historical facts of life alone assure the meaningfulness of “John
Stuart Mill is a blue idea who slept peacefully,” and hence that no one need
deny the reality of subatomic phenomena while maintaining the noncognizability
or nonmeaningfulness of some specific general theory about them. But
though no one is forced to go from R1 to R2, who, it might be asked, would
be interested in R1 and find a need for the dull nondisconfirmable version
of R2 just suggested? A malicious answer would be that many realists are,
or appear to be, when for a variety of objections that they have to its standard
interpretation they consider quantum mechanics neither true nor false, and
suggest that it is best viewed as a calculating device.10
But the point is not simply an ad hominem one. The instrumentalist is
not only arguing against the realist; he is assuming that anyone will discover
some theories which he will find it hard to interpret and which he will
not dismiss from scientific use. Hence the instrumentalist is arguing that
anyone will adopt some version of his position which would at least rule
out R2. Moreover, the instrumentalist is arguing that a scientist is justified
in using theories which cannot be interpreted as true or false, and the
realist would not be justified if he argued that scientists are not justified in
using quantum mechanical theories which the realist cannot accept as
meaningful.
I have argued for a modified version of methodological realism, one
which accepts R1 and a modified version of R2. I add that there is
need of modifying R4, at least on one standard interpretation of it.
Those who have accepted R1 have done so because they have accepted
hypotheses and theories in the light of the total scientific evidence.
There is, of course, nothing untoward in that, except for the fact
that acceptance of hypotheses seems to be ruled out by the standard interpretation
of R4, which insists that scientists qua scientists do not accept
hypotheses, or that the acceptance of a scientific hypothesis is the performance
of an extrascientific act. More fully, R4 insists that scientists
have a method or set of rules only for testing and possibly rejecting theories
and hypotheses and only for that, since that is all the hypothetical-deductive
method can provide. Conversely, to allow for the acceptance of R1 we
must either deny the conjunction (a) that science is distinguished by its
method and (b) that the hypothetical-deductive method is the one, or at
least supplement R4 with a method for accepting hypotheses. But since
scientists do accept hypotheses I assume that such supplementing is necessary.
There is an extra dividend that can be derived from this last point.
On occasion those who accept R3 and deny that science aims at explanation
and prediction do so because they insist that scientists qua scientists can
only offer conditional predictions and that any straightforward acceptance
of a predictive utterance is not scientifically warranted. The acceptance of
R1, if all right, rules out this latter move in defense of R3, and supports
I3, the thesis that science aims at explanation and prediction.11
Let us return to our beginning point. Thus far I have assumed, and I
hope not without justification, that we can discuss the interconnections
between R1 and the subsidiary theses without attending too directly to
R1 itself. The time has come to take a closer look, which immediately
reveals that R1 itself is, for a variety of reasons, not a disconfirmable
thesis. We have already agreed that those who accept R1 do so because
they accept such sentences as “there are electrons and protons,” etc. It is
appropriate to note the lack of symmetry; namely that some may reject
any one of these sentences and still maintain R1. It is also appropriate to
recall that R4 does not provide us with rules for acceptance of hypotheses,12
and draw the consequence that one philosopher may accept R1 and R4 and
accept only the reality of electrons and another the reality of electrons
and fields. Hence we may conclude that R1 does not present us with
refined clues about the ontology of the realist; at best we know that he
accepts some kinds of theoretical entities as real. But which ones?
It is not open to the realist to answer that he admits as real all entities
which the scientist finds it helpful and useful to countenance. For that would
force upon him not only perfectly competitive industries but also factors of
the mind, and Gibbsian ensembles.13
R1 is not, however, completely uninformative. If we know that someone
adopts it, we can predict with some reliability that he will not
dismiss as unclear any predicate simply because it is a theoretical one. Predictions
about specific predicates remain unreliable; a realist may have difficulty
with specific ones (e.g., “collapse” applied to the wave function). Smart
and other realists have claimed that the thesis “theoretical entities are real”
is intended to inform us that there is nothing defective about theoretical
concepts, that “electron” for example is as clear as “red.” Since by parity of
reasoning the realist should say that dragons are real, this explanation would
not do. But this is not the major objection to the realist’s explanation of his
use of R1 rather than of the simpler one “theoretical entities exist.” If I
am right, he is only in the position of saying that some theoretical concepts
are not defective. Note finally that given the realism of the ontology of a
theory the realism of a theory as a whole is not decided. For at least three
reasons, one of which is obvious. The theory may be deemed unrealistic
because its sentences describe nonrealizable or nonfrequently realized
phenomena; the theory may not be realistic because it is not objective
in the sense that at least one of its lawlike sentences contains a fragment
“Pa” which entails “Ra, b” where ‘b’ stands in place of a name or description
of a human agent.14 And since it is at least debatable whether
quantum mechanics satisfies this last criterion for objectivity and realism,
the claim of the methodological realist that acceptable physical theories
must be realistic is equally debatable.15
Again, the realist thus far does not inform us whether he accepts certain
types of entities as “real” ones, as self-existing ones and not merely as
temporary stages of others; and hence he does not inform us when theories
that countenance entities of type K need be supplemented with theories
which countenance entities of type L, of which K-type entities are temporary
stages. The problem raised is not an idle one, for it and its analogue were
raised when scientists asked whether fields are real, and it still deserves attention.
But on all accounts the realist may be described as defending the
nondisconfirmable thesis that some theoretical entities exist or its semantic
analog that some sortal predicates are true of some entities, and owes us
an explanation of his use of “real.”
That being the case, R1 can be viewed as a supplement to the contextualist’s
position which warns us not to interpret theories and their terms
literally and which insists that we understand theoretical terms by studying
the uses to which sentences in which they appear are put by scientists.
Needless to say, this position is not a very clear one. But if we interpret it
in the light of the uses to which it was put, its thrust becomes relatively
apparent. Instrumentalists wanted to show that there is no need of viewing
terms which ostensibly introduced new K types of entities (e.g., points) as
doing so, but only as providing us with ways of predicting the behavior of
familiar objects, when they can be treated as if they were K type entities.
Instrumentalists therefore had a deflationary approach to the ontological
commitments of scientific theories which need not violate the spirit of R1.
Why then in the light of its obvious limitations does the realist not merely
emphasize R1 but place it at the pinnacle of his position? I think he does
so in order to emphasize the indispensability, not merely the importance,
of theories which countenance theoretical terms for explanatory purposes.
A popular thesis maintains that scientific disciplines typically go through
stages one of which consists of testing and confirming empirical and
experimental laws and the other of presenting and corroborating theories
which countenance theoretical entities and which can be used to explain
the previously accepted experimental and empirical laws. The realist objects
on two grounds: the popular view fails to mention that previous
laws were not explained but corrected in the light of new theories; the
popular view misleads us into thinking that we can even in part achieve
our aim of explaining nature by trying to confirm laws expressed in purely
observational language. Hence once again the interconnection between R1
and R3, and the opposition to the noncognitivist contextualist. It will be
remembered that the latter construes experimental laws and theoretical
ones as rules of inference and does not distinguish between them as candidates
for explanatory purposes, though he admits the latter may have a
greater potential explanatory power. Remember that the weak noncognitivist
can simply adopt R1 and R3 (provided of course he finds a satisfactory
substitute for the nomological model of explanation) but does not mesh
them in the way the realist, for apparently good reasons, does.
But in defense of the standard story, notice that we may never reach our
aims of explaining nature even if we try to confirm general theories which
countenance theoretical entities. For, after all, they too are corrected as
inquiry proceeds, and, for all we know, no theoretical law will ever survive
as long as such simple observational ones as “no rock waltzes,” “no man
is born talking” have. Two other nontrivial objections to the realist account
of the matter may be noted. In his criticism the methodological realist vacillates
between “lawful” and “law” and fails to note that lawful sentences are
replaced by others expressed in the same vocabulary. And though the
lawful sentences accepted at T are not, their successors are deducible from
the new theories, or at least not at variance with them. The realist also fails
to add that theories which countenance theoretical entities are frequently
embedded in theories which presumably apply both to observational and
theoretical entities and thereby overemphasizes the centrality for explanatory
purposes of theories which countenance theoretical entities.16
The sudden reference to the standard empiricist story might be taken as
evidence for a certain degree of unhappiness with the weak noncognitivist.
If so, the taking is correct. The weak noncognitivist does not carefully
distinguish between theories which are noncredible because they are senseless
and those which are noncredible because they are false or only slightly
confirmed. Moreover his reasons for treating the latter as rules of inference
are not always persuasive, and sometimes irrelevant. Thus even if his
claim that laws and theories are not used as premises in typical scientific
explanations be granted, we can balk at his conclusion that they function
as rules of inference. Actually his thesis is threefold: (a) laws and theories are
not true or false because, among other things, they (b) do not serve as
premises in explanations, and hence (c) they are best construed as rules of
inference. I suggest that the weak noncognitivist would do well to accept
refurbished versions of (a) and (b) taken singly and in that manner supplement
the realist.
Often we apply the term “theory” to a body of sentences which presumably
tells us a relatively complete story about a certain type of subject
matter, and sometimes we have applied it to the axioms in some privileged
formalization of the theory in the previous sense. Axiomatization, which is
generally undertaken with the flight of Minerva, is of many sorts; in one,
which has been developed by Suppes and others, the theory becomes a
set-theoretical predicate. Theories function as predicates which in an extended
sense are true of or satisfied by some entities; they are not themselves true.
Here then is one way of providing some support for the weak noncognitivist
claim (a), but it is itself a weak one. For the problem of the truth of the
theory is not really avoided; we still face that question when we want to
know whether the theory is satisfied in some intended model.17
The term “theory” is not only used in the august ways just indicated.
For on occasion it is applied to sentences or semi-sentences which indicate
the type of laws that the given discipline might develop, on others to sentences
which though lawlike do not function as principles from which the relevant
laws are to be derived. At best the laws are in accord with them. The
latter type supports claim (b) of the weak noncognitivist, and the former
(a); taken together they weaken R4 if that thesis is intended to mean that
theories in the sciences always function as hypotheses from which laws are
derived and explained.
Consider for example the simple germ theory of disease (S): “everyone
who is diseased has been infected by a germ.” S is lawlike and can be used
for explanatory purposes but only for nonrefined ones. If we want to know
why A is diseased and B is not we might appeal to theory S, but if we want
to know why C has disease C and D has disease D, S is of no aid. To
explain the difference between C and D we need to discover more refined
laws which might not have been sought had there been no commitment
to S but which are not entailed by S. S-type lawlike sentences are theories
for rather than theories in a discipline D. The latter (e.g., Maxwell’s
equations) do stand as premises in refined explanations of D and do explain
the laws of D; the former do not. S itself is relatively simple, but more refined
examples of the same type can be offered, some of which are not
falsifiable,18 and which many noncognitivists therefore somewhat misleadingly
described as neither true nor false. What does remain of contentions
(b) and (c) of the weak noncognitivist is this: some theories are not used as
premises in explanation; scientists are not always in a position to say that
their theories are true or false but only that they have proven fruitful. Some
use can also be found for contention (a).
Since Duhem it has been a commonplace that scientists do not apply
the term “law” only to sentences which are lawlike and true but also to some
that are false.19 Instrumentalists add that scientists also apply the term
“law” to expressions which are neither true nor false and which they
describe as holding or not holding.
One way of understanding these phrases is as follows. The scientist
when offering us a given law which doesn’t hold is presenting us with
causal-schematic expressions. He may say that the law is “All A is V”;
what he means to assert is that “All A • C is V” where ‘C’ functions as a
schematic letter to be replaced as inquiry continues. Causal-schematic expressions
are neither true nor false and are often called laws; or rather, some
laws are best understood as causal-schematic expressions. Everyday life
supplies some examples. A friend tells us that he has a headache and we
advise him to take an aspirin. How should we interpret our beliefs? It will
not do to suggest that we believe that everyone who has a headache and
takes an aspirin will be relieved of his aches and pains. And neither will it
do to suggest that we know of a reliably true statistical generalization
which we can use as a basis for a rational bit of advice. But two alternatives
remain. We may be only committed to the simple belief that for the
most part people who take aspirin are relieved of their aches and go no
further. However we may think that if anyone who is under some now
nonspecified conditions (C) has a headache and takes an aspirin he will,
soon enough, return to a state of normalcy. The differences between these
two commitments should be evident. In the first case, we do not necessarily
think that through further inquiry we may be able to get an interesting
true law. In the second case, we think that in the future scientists may be
able to specify exact conditions under which some people do and some
people do not become headache-losers when they are aspirin-gainers.20
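The contrast between a law read literally and the same law read as a causal-schematic expression can be put in quantifier notation. This is only a sketch: the quantifier rendering and the predicate letters Ax, Cx, Vx are assumptions introduced here for illustration, not notation used in the text.

```latex
% Literal reading of "All A is V": a closed sentence, true or false.
\forall x\,(Ax \rightarrow Vx)
% Causal-schematic reading "All A . C is V": C is a schematic letter,
% so the expression has no truth value until C is replaced.
\forall x\,\bigl((Ax \land Cx) \rightarrow Vx\bigr)
```

On this rendering the second expression becomes true or false only once inquiry replaces C with a definite condition, which is the sense in which it is neither true nor false as it stands.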
The example given supports the contention of the weak noncognitivist,
but it need not embarrass the realist, for he might insist that causal
schematic expressions and the simplified theories we have considered are
to be replaced by laws and theories which are true or false and which do
explain fully. Hence his contentions R2 and R3 are saved if construed
as a normative thesis or as a directive. Since the weak noncognitivist offers
descriptive theses, he has no answer to this version of realism. But the
contextualist whose position needs amplification does.
It will be recalled that the contextualist is concerned with the justification
of scientific theories and procedures, insists that the justification must always
be contextual, and adds that science sets its own aim. But since beliefs,
actions, and policies, not theories simpliciter, are justified, we must expect
the contextualist to have at least two kinds of theses: one about justification
of belief in the scientific theory, or the justified acceptance of that theory,
and the other about the justification of various types of acts or policies
which have the theories as their goals or instruments, even if it is a simple
policy of trying to confirm a lawlike sentence of a given kind.
It is a commonplace that a scientist may be justified in accepting a lawlike
sentence S at a given time even if each subsequent inquiry showed that
S is false. And since it is equally obvious that a scientist may be justified
in not believing S even if S is true, it appears that there are no formulas
of the form “A is justified in believing S if and only if S is true,” or “A is
justified in disbelieving S if and only if S is false.”
No wonder then that the contextualist, concerned as he is with issues
of justification, finds R2 of little help. It provides no guide for the construction
of rules or formulas that can be used to assess justified acceptance
or rejection of laws, theories, and hypotheses. At this point the realist might
raise the cry of foul and accuse the contextualist of cheating for neglecting
to attend to the obvious formula: a scientist is not justified in accepting a
given theory or law if it has been falsified.
Nothing seems more reasonable, and if the realist gives good reasons
for believing this is the only rule that is needed, the contextualist’s case
seems to be a weak one indeed. But one charge invites another and the
contextualist countercharge is easy to construct. Unless a scientist knows
that a theory has been falsified he need not reject it. Hence the realist has
cheated for not presenting the formula thus: a scientist is not justified
in believing T (or not rejecting T) if he knows that T has been falsified.
But stated that way the rule is otiose: a scientist who knows that a
theory is false cannot decide to believe that it is true; he must reject it.
Remember, that to reject a law or theory is not to discard it; one may
decide to keep a rejected law on the books for a while and seek ways to
correct it. But since the realist has no rules to offer as help to a scientist
who is deliberating whether to discard or to correct a rejected law, he
cannot claim that he has offered rules, if not for acceptance, at least for
the rejection of hypotheses.21 The instrumentalist adds that the realist
has, in accepting R1, accepted hypotheses and hence cannot turn his back on
the problem of justified acceptance of hypotheses.
But since the contextualist does not merely hive off the mistakes or
omissions of the realist, we may expect him to have his own views to defend
about the topic of justification of belief, or justification of the acceptance
of hypotheses and theories, as credible though revisable. What then are his
views? The standard answer takes the contextualist at his name value and
portrays him as maintaining that there is no general problem of justification,
that there are only specific contextual ones. Presumably, at least three
things are intended. The contextualist is viewed first as contending that
the problem of justification of knowledge in general and the one of trying to
show how knowledge is possible are pseudo-problems, and secondly, as
transforming a commonplace into a complex principle. Normally—that is,
when we are in control of our faculties and are not playing Epistemology
I games—we don’t dismiss all our neighbors’ beliefs as equally worthless
but challenge specific ones which we think have specific failings. From
this commonplace comes the principle: (1) if A accepts S as credible, then B
can (in the ethical, not causal sense) challenge A only if he has specific
objections to S, or adduces reasons against the credibility of S, and (2) A
can justify his acceptance of S only by answering the specific objections
raised by B, by showing that S is in accord with the criteria or rules that
the reasons adduced by B indicate. The matter does not end here. There
is a third note sounded by one contextualist. In answering B, A will most
likely appeal to other sentences T1, . . ., Tn which he will use to support S.
But there is, for the contextualist, nothing special or blessed about T1, . . ., Tn.
Though accepted here, they may be challenged later and be transformed
from justifiers to sentences in need of justification. For the contextualist
then, sufficient unto the context the justification thereof.
To understand the contextualist position a little more amply, let us try
to answer an obvious criticism. It may be contended that the contextualist
has made progress with A only by overlooking the difficulties faced by or
raised by B, and has done so by shifting the onus of responsibility upon B,
or upon us the critics, of justifying B’s doubt, or of showing that B’s reasons
are legitimate ones and justify his doubt or agnosticism. The criticism is
apt but not crushing; the contextualist has not shifted the problem from
A to B, he assumes that B has nothing to justify to A. “Assume” is too
weak a word: the contextualist notes that the reasons which B advances
(or the rule to which he appeals) must be accepted by A as good ones,
for otherwise he would not attempt to answer B. A related answer can be
given to a corollary of the criticism just reviewed. It may be claimed that
the contextualist has shifted the problem of justification away from S to
the other sentences T1, . . ., Tn which A appeals to in his support of S. But
surely, it may be contended, A cannot appeal to any odd set of sentences,
but only to those justified in a sense thus far unspecified by the contextualist.
The answer is apparent. The contextualist is assuming that T1, . . ., Tn
are accepted by A and B and that A has nothing to justify to B about his
acceptance of T1, . . ., Tn.
But, though the contextualist may not have shifted any responsibility,
we may have shifted our explication or at least its main emphasis. We began
with a standard view which emphasizes the contextual features of justification;
we ended with a social theory of justification, “social” in at
least two senses: (A) that it takes at least two to tangle and (B) that the
disentangling, or the process of justification, cannot begin unless those
engaged in the process share certain beliefs and certain rules or criteria.
Hence, allowing for a few steps, the instrumentalist position comes down in
part to this: that a scientist is not justified period, but justified to those
engaged in the scientific enterprise if his beliefs are in accord with the
self-imposed rules of the scientific community. The phrase “in part” suggests
that there is more to come. There is. The contextualist position provides
us with a partial explanation of the central role that intersubjectively
confirmable observation sentences and inductive rules play in science. The
point about the former is obvious; a word or two may suffice about the
latter. Inductive rules not only express universally shared habits whose
acceptance no one need justify to anyone else; they also express habits we
cannot choose to have and whose acceptance we never have to justify.
To base science on induction is therefore to have a base whose social
justification never arises. Still, it would not do to say that science is, for
the contextualist, based on induction or rests upon observation or is, as
defenders of R4 might claim, characterized by the employment of the
hypothetical-deductive method. What then characterizes the scientific community?
The previous reflection provides us with a clue. Science for the
contextualist is characterized by the fact that its members share higher-order
attitudes about beliefs and attitudes, namely dispositions to revise them in
light of the objections to them brought by others.22
It must, in all candor, be noted that many instrumentalists do not think
that the story ends at this point and that others insist that it doesn’t even
begin with it. For many of the latter hold, under the influence of Lewis, that
the central problem of justification remains the traditional epistemological
one of trying to explicate “A knows that S” and to indicate for any S the
conditions that have to obtain before it can be known. The issues raised by
these conditions are too complex to be entered into here, so I will note
that I think that the latter contextualists are correct and that their views
must be taken as a supplementary view to the one raised by the social
theory of justification just reviewed. A less charitable note is, I think, in
order about the theses defended by some contextualists who have attempted
to show that rules which are appealed to for the purposes of justification
may themselves be justified by referring to the role they have played in
inquiry, that they have been tacitly obeyed in cases where problems were
solved and were disobeyed when problems were generated. Perhaps they
are right, but if so they are defending a theory which must be justified in
the same way as any theory must be justified. Secondly, and perhaps
more to the point, I do not think that this approach to justification of rules
has been fruitful or has led to the discovery of any new type of scientific
rules. In the main, traditional inductive rules have been discussed and
their justification attempted by instrumentalists—unsuccessfully, I think, for
though the problem of inductive support for inductive rules may be manageable,
I do not think it is identical with the one of showing that inductive
rules have been problem-removers. But the appeal to problem-solving is
not to be dismissed. It plays a role in the justification of acceptance of
theories and hypotheses not as credible but as worthy of testing and developing
or as usable for a given purpose. I advert to some of the issues involved.
In assessing their work and the work of their colleagues, scientists often
defend theses to the effect (a) that a scientist may be justified in trying to
develop a theory though it is believed to be false,23 (b) that a scientist
may be justified in using T1 for predictive or calculating purposes and
using T2 for related ones though their conjunction is self-contradictory,24
(c) that a scientist may be justified in using T1 for a given purpose and
neglecting T2 though T1 and T2 are logically equivalent.25 These and
kindred theses motivate the plausible instrumentalist thesis that a scientist
is justified in using a theory T to accomplish a given end, if he has good
reasons for believing his theory-guided act will accomplish his end.
But, though plausible, the thesis needs emendation and qualification. That
the thesis needs emendation will be evident when we recall that a scientist
may be justified in using T even if he has little reason to believe that T
will accomplish his end, but has little option, and that using T may have
more expected utility than any other available act. And, that the thesis
needs qualification becomes equally evident when we recall that a person is
usually considered not necessarily justified but only rational if he accomplishes
his end in an effective manner; attribution of justification requires
attention to the worthwhileness of goals and ends, not merely to the efficiency
of behavior. But though it is evident that the thesis needs qualification,
emendation, and supplementation, it is not evident how to supply the
requisite revisions; for we need information about shared scientific ends
and ways of ranking them, which the contextualist does not supply. He is
of course not entirely silent on these matters for he provides a hint when he
suggests that inquiry sets its own aims, which in turn suggests that the ends
of science are to be assumed as worthwhile and none to be dismissed as
denigrating. But how to develop this hint into a program for ranking
scientific ends and assigning different utilities to them is not evident.
Despair may, however, be premature. R3 may perhaps be used to sug¬
gest a strategy, and if under it does do successfully, we may once again
find support for our irenic metathesis that realism and contextualism may
reinforce rather than undermine each other. As I understand it R3 does
not deny that theories are used for calculating, predictive, postdictive purposes
as well as explanatory ones and, hence, is not at variance with recent
theses which emphasize the varied uses to which scientific theories are
put. R3 adds to these observations the quasi-normative one that the act of
explaining has so much more utility than other types of scientific acts that
we can, for theoretical purposes, neglect the others and argue that a
theory S1 has more potential worth than S2 if S1 has more potential explanatory
power than S2 and similarly far more actual explanatory power. We
seem to have progressed. The problem of specifying rules or criteria for the
assessment of scientific action appears to be complex; the realist has, by
restricting himself to one criterion, simplified matters. But simplicity is here
no mark of truth.
Consider first the case of an advanced science which contains one accepted
well-confirmed theory which has no rivals, explains much, and
leaves much unexplained. There being no rival theories, there is apparently
no need for a realist directive to guide choice, but there may be a place
for a realist bit of advice: namely, to seek a more general theory which
would, by explaining more, be superior to the accepted one. And since we
have grown accustomed to the thesis that the work of science is never done,
the directive seems appropriate. But here appearance may be deceptive,
for though the work of science may never be done it may, given the current
theory, be done in a given area. For the current theory may be true and
may truly be interpreted to show, not merely that some things are unexplained,
but that they must remain unexplained.26
The converse situation arises in sciences or disciplines which are in their
inductive stage and are devoid of any general theory, much less a number
of them which do not merely guide research but which act as explainers in
their disciplines, theories in the discipline, to use our previous terminology.
Once again, the realist can supply not a criterion for choice but a directive,
here to seek a general theory which will be usable for members of the
discipline and enable them to deal in a uniform manner with their subject.
Our comment here is reminiscent of the one just made. We have grown
accustomed to advice of this sort, and indeed it is notorious that every
discipline D has its members out on a search for a general theory in D.
Still, custom has its limitations and must occasionally be challenged. Here
the problem is that it may be a waste of time to seek a general theory in
D, the truth being that every D-type phenomenon can be dealt with by
some general theory—not necessarily the same one—of another discipline.
It is a commonplace that we forecast the weather without benefit of a general
theory of weather; it should perhaps be as commonplace that we should
avoid the quest for general theories of crime, business cycles, marriage,
war, and divorce. The realist must supply us with some estimate of the probabilities
of the true situation if he is to provide us with advice which we
should follow, and I fear that he cannot.
Most disciplines are not in the blessed or freshman stages just discussed,
but contain theories of mixed virtues and defects. The realist seems to be
on his strongest grounds when he theorizes about theories which have some
virtue: those that are confirmed and not disconfirmed and do have various
degrees of explanatory power. In this case the realist need merely direct us
to choose among the rival theories, and accept the one which is strongest or
simplest—or most disconfirmable—whichever mark is used as a mark
of explanatory power. Once again, the realist contention seems reasonable;
we have all become accustomed to the view that science is risky and seeks
scope and system at the price of safety. But, here too, custom is not a
good guide and must be challenged. The problem here is that the realist
is not clear, I think, on what the scientist is presumed to do, and vacillates
between two directives: (1) accept as credible, and (2) accept as worthy
of test and development. If on the basis of the data “All Americans are
mortal” and “All men are mortal” are confirmed, and neither disconfirmed,
the former would seem to be more credible, the latter more
worthy of test. But we might say that the latter would be more worthy of test even
without a glance at the data on the grounds that its potential explanatory
power or its potential degree of worth is greater.27
The remaining task is to compare defective theories to each other and
to rival virtuous ones, and it appears to be a self-liquidating task. For
given a choice between a defective (falsified) theory, and a virtuous one
(nondisconfirmed) we must choose the latter; given two defective theories,
we must choose neither. Nothing is neater and nothing more irrelevant as
an answer to the instrumentalist problem with which we began. We wanted
to know how to justify the use of defective theories; we ended with a
criterion for their rejection as noncredible. And that we would cast no light
on the instrumentalist problem is evident when we recall that for a realist a
theory must be credible, for otherwise it would lack explanatory power.
Hence there is no need for the instrumentalist to fold his theses and fade
away.

NOTES
1. The text for this sermon is of course: Ernest Nagel’s classic chapter, “The
Cognitive Status of Theories” in Structure of Science, pp. 106-152.
2. See the article by C. Warren Hollister, “Greek Astronomy and Modern Science,”
1961, History, Volume 4, pp. 53-72. Also references in Roman Science, University
of Wisconsin, 1962, William H. Stahl, pp. 159-162; The Mechanization of
the World, E. J. Dijksterhuis, Oxford, 1961, pp. 212-216.
3. Carnap’s discussion in Philosophical Foundations of Physics, pp. 254-256;
on Friedman, “The Methodology of Positive Economics,” Friedman, M.; “Assumptions
in Economic Theory,” Nagel, E.; reprinted in Readings In Microeconomics,
edited by W. Breit and Harold M. Hochman, pp. 23-47, pp. 60-68 respectively.
See discussions in Quantum Physics and the Philosophical Tradition—Aage Petersen;
M.I.T. Press, 1968, chapter 1, especially, pp. 12-20.
4. For Popper, K. R., “Three Views Concerning Human Knowledge,” reprinted in
Conjectures and Refutations, pp. 97-119. Smart, J. J. C., Between Science and Philosophy,
Random House, 1968, pp. 121-163. See Feyerabend, Paul K., “Reply to
Criticism,” pp. 223-262, Boston Studies in the Philosophy of Science, Volume II,
1965.
5. The Philosophy of John Dewey—N. Y. 1939, Reichenbach’s “Dewey’s Theory
of Science,” pp. 157-192. “Dewey’s Answer,” pp. 534-45; 574-75.
6. Fists are often considered less real than hands because they are merely
temporary stages of them, bodies less real than the microscopic entities of which
they are composed (temporary stages of such bodies or rather those bodies temporarily
hanging or staying together). The latter are real if they are permanent and
everything that is can be viewed as composed of them. When arguing against the
thesis of the reality of theoretical entities, Dewey, among other things, claimed that
there are no permanent entities, that everything that exists is an event. For a brief
and illuminating discussion about the reality of species which bears upon this distinction
see: “Retrospect of the Criticism of the Theory of Natural Selection,” by
Sir R. A. Fisher, pp. 107-110, in Evolution As A Process.
7. See the interesting discussion in Twentieth Century Science, edited by D. D.
Runes, Philosophical Library, 1947; essay by Victor F. Lenzen, pp. 107-129; John
Dewey in an essay by E. Nagel, “Dewey’s Theory of Natural Science,” pp. 231-248,
Philosopher of Science and Freedom, edited by S. Hook. Mach’s position is not
obviously Machism; at least according to some reports. Thus—Mach, the founder of
European physicalism, strenuously opposed the hypothesis of the molecular structure
of matter, on the grounds that molecules were not observable (at the time he was
right in that), and hence not legitimate building blocks for a physical theory. In
other words, molecules were “metaphysical.” The young Einstein, who at that time
was a great admirer of Mach’s but who was also very much concerned with the
molecular explanation of Brownian motion and similar phenomena, ventured to
journey to Mach’s Institute to raise this question: Suppose that the theory of molecular
structure did no more than predict a relationship between the coefficients of heat
conductivity and of viscosity which could be confirmed experimentally, would Mach
then accept the molecular hypothesis as a legitimate physical theory? Mach answered
affirmatively but apparently considered this contingency so improbable that he continued
to oppose the molecular theory in public for some time to come. (Delaware
Seminar in the Foundations of Physics, Page 3.)
8. For the best brief statement see W. V. Quine, “Ontological Relativity: The
Dewey Lectures,” Journal of Philosophy, Volume LXV, No. 7, April 4, 1968, especially
pp. 185-190.
9. Here I disagree with Quine, Word and Object (M.I.T. Press, 1960, pp. 248-251).
My discussion is closer in sympathy with W. J. Rankine, Miscellaneous Scientific
Papers, p. 210, where he distinguishes between objective and subjective hypotheses
(London, 1881, Charles Griffin and Company).
10. Smart—see Between Science and Philosophy and references quoted therein.
11. Popper’s discussion, though more qualified than indicated in the text, is subject
to related criticism. See “Prophecy and Prediction in The Social Sciences,” pp.
336-347, in Conjectures and Refutations.
12. Scientists employ a double standard. When they say they accept a theory T,
they are to be interpreted, I think, as believing that T, if proven false, will be shown
to be a limiting case of a more inclusive theory. In the interim they may be interpreted
as accepting causal-schematic statements. Examine for example: A physical
theory seems to have only a finite lifetime. Newtonian mechanics was replaced by
relativistic mechanics, thermodynamics by statistical mechanics, classical by quantum
mechanics. But did these theories become wrong after having been proved correct
over such a long time? What actually happened was that they continued to provide
the correct predictions, but only for a limited set of phenomena. For example,
Newtonian mechanics became restricted to phenomena in which the velocities are
small compared with the velocity of light. It became an approximate theory. Whether
the approximate nature becomes apparent in a given instance depends entirely on
the accuracy of the experiment.
Thus we learn that an established theory in a certain sense never becomes wrong;
it only becomes restricted to a domain of validity (Classical Charged Particles, F.
Rohrlich, pp. 3 and 4).
13. The whole story about Gibbsian ensembles is complicated. Thus: That general
procedure of statistical mechanics with which is associated the name of Boltzmann
is concerned with a single dynamical system and uses statistical methods to
make calculations pertaining to such a system. The treatment introduced by Gibbs
deals, on the other hand, with collections of similar systems, together with an
appropriate distribution function, and calculates averages over such ensembles of
systems; the statistical considerations are thus much more fundamental in this approach.
Gibbs, however, refrained on the whole from attributing physical significance
to his concept of an ensemble, and regarded the behavior of an ensemble of systems
as being but formally analogous to that of a single physical system. Although Einstein
introduced ensemble theory independently of Gibbs and did consider the
average behavior of an ensemble to be indicative of the behavior of a single physical
system, yet the method was regarded by some as being lacking in what Fowler
termed the “ ‘sanity’ or physical reality” of the Boltzmann approach. The concept
of the ensemble received some physical import, however, from Tolman’s stressing
that its use is required by the limitations of physical measurement. (Ergodic Theory
in Statistical Mechanics, I. E. Farquhar, p. 7.)
14. For a typical case about the realism of the sentences, see:
a) Ergodic Theory and Information, Patrick Billingsley, p. In. It is hard to see how
to fit P into the frequency theory of probability: the grand experiment cannot
be repeated, since one replication uses up all of time. No matter: although the
infiniteness of the sequence of component experiments is mathematically crucial,
it must not be taken literally in real applications.
For a most illuminating statement, see:
b) B. D’Espagnat, “Things, Structures and Phenomena in Quantum Physics,” p.
377, in Logic, Methodology and Philosophy of Science, edited by Van Rootselaar
and Staal.
“They therefore demand that a description of a physical law or principle should
be objective in the sense that it should not refer, not even implicitly, to any
specific abilities or inabilities of the ‘observer.’ ”
15. Thus we need face four problems before we come to a conclusion about the
realist interpretations of scientific theory.
(1) The “realism” of the mathematics—see e.g., Volume and Surface Integrals
Used in Physics, J. G. Leathem, pp. 4-5.
(2) Ontology
Since many laws and theories are presented in equation form it is often hard
to know what entities are assumed. Recall that Sir Joseph Larmor pointed out
that a set of equations equivalent to Maxwell’s was first used by MacCullagh
in 1838 as a scheme consistently covering the whole ground of “Physical Optics,”
Collected Works of James MacCullagh (1880), p. 145.
Recall that many entities are assumed only to be explained away (e.g., Weiss’
Theory of Molecular Fields), and that often the disposition of the entities are
not well-anchored. See, for example, H. A. Lorentz, Theory of Electrons, p. 2.
(3) The “meaningfulness” of the predicates—see e.g., L. Boltzmann, Lectures on
Gas Theory, Introduction and letter quoted, pp. 15-17, Translated by Stephen
G. Brush, University of California Press.
(4) The status of the ultimate entities—see e.g., Fields and/or Particles, D. K. Sen,
The Ryerson Press, Conclusion.
16. See the discussion of the case in electro-magnetism in J. H. Van Vleck,
Electric and Magnetic Susceptibilities, p. 1, footnote, Oxford, 1932. Also Morris
Kline, Irwin W. Kay, Electromagnetic Theory and Geometrical Optics, pp. 325-
330. I overlook the problem of a general space-time physics as developed by
Wheeler and his students, but see The Concept of Matter, edited by Ernan McMullin,
University of Notre Dame Press, 1963; essay by Charles W. Misner, “Mass as
a Form of Vacuum,” pp. 596-609.
17. See essay by E. W. Adams on “The Foundations of Rigid-Body Mechanics
and The derivation of its Laws from those of Particle Mechanics,” pp. 250-265, in
The Axiomatic Method, edited by Leon Henkin, Patrick Suppes, Alfred Tarski,
North-Holland, Amsterdam, 1959.
18. For one example—see the discussion by Frank Restle in “Siegel’s Contribution
to Learning Theory.” One type of choice theory says that the response chosen will
maximize or optimize something. (Decision and Choice, Contributions of Sidney
Siegel, p. 276.)
19. See discussion in Optimization in Control Theory and Practice, I. Gumowski
and C. Mira, p. 15.
20. For more sophisticated examples, see A Logical Analysis of the Theory of
Relativity, Håkan Törnebohm, Stockholm, 1952, pp. 180-193.
A fuller discussion would require distinguishing between regression equations,
lawlike, causal-schematic statements and laws which define their terms. See for the
latter issue, Schrödinger’s discussion, p. 65, in Scientific Papers Presented to Max
Born; also Hermann Weyl, Philosophy of Mathematics and Natural Science, p. 148.
21. Recent discussions have tended, I think, to overemphasize the revisability of
theories. Note, Einstein’s opinion, “If a single one of the conclusions drawn from it
[general relativity] proves wrong, it must be given up; to modify it without destroying
the whole structure seems impossible.” Quoted in Jaki, The Relevance of Physics,
p. 246, University of Chicago Press, 1966.
22. For a development of this theme see “Social Inquiry and Moral Judgment,”
S. Morgenbesser, pp. 180-200, in Philosophy and Education, edited by I. Scheffler.
23. The crucial question is not whether coefficients of production are, in a descriptive
sense, rigorously fixed—quite obviously they are not—but whether treating
them as if they were yields good predictions; whether, that is, treating them as if
they were involves neglecting factors that are only “minor” disturbances or involves
throwing the baby out with the bath water. It is sometimes argued that, if predictions
that neglect changes in coefficients of production turn out to be bad, this
“difficulty” can be overcome by complicating the analysis, for example, by introducing
changes (generally by unspecified methods) into the entries in the input-output
table before using it for prediction, or by substituting linear programming models.
This seems to me highly misleading: substituting linear programming or the other
devices is a retreat, not the surmounting of an obstacle; it means that this particular
simple hypothesis has been a failure, and that we are therefore forced to turn to
more complex—and by the same token less useful—models, or to try to achieve
simplicity in some other way. (Input-Output Analysis: An Appraisal, National
Bureau of Economic Research, p. 171.)
24. “The primary test of validity is a pragmatic one—does the description fit
experimental fact. Many systems are most conveniently treated as continua for some
purpose and as systems of discrete particles for others.” See H. Wayland, Differential
Equation, Van Nostrand, 1957, p. 4.
25. Thus, the maximum principle of control theory and the principle of optimality
of dynamic programming are equivalent. The Euler-Lagrange conditions of the
calculus of variations are a special case of either.
Mathematical equivalence is not tantamount to equal usefulness in practical
application. A great convenience of the dynamic programming approach is the ease
of formulating the principle of optimality and the fact that the maximization required
by the principle of optimality may be performed directly. Disadvantages arise when
the number of state variables is large. On the other hand, the classical calculus of
variations sometimes generates explicit solutions, and control theory permits an
analysis in terms of phase diagrams. (Dynamic Programming of Economic Decisions,
Martin J. Beckmann, p. 134.) The mathematical equivalence of the field theory
approach and the field quantum approach has been shown by Dyson. The use of the
concept of the field quantum enables us to discuss a physical process in terms of
emission and absorption of field quanta. This makes the calculation of the transition
probability for a particular process a much easier task than it would have been for a
pure field theory (Stellar Physics, Hong-Yee Chiu, Volume 1, p. 248).
26. See for example: Sin-Itiro Tomonaga, Quantum Mechanics, John Wiley,
1966, Volume II, p. 253. “Thus, the acceptance of the universal validity of the uncertainty
relation inevitably means the abandonment of the attempt to introduce the
hidden parameter to explain the diffraction.”
27. My position here is close to the one defended by Levins, Changing Environments,
Chapter 1.
THE IDENTITY THESIS1
Judith Jarvis Thomson

There may well be differences between the mental and the physical
which rule out identifying them. The Identity Thesis issues from felt
peculiarities in the mental, felt differences between it and the nice, solid,
familiar, locatable physical, and so it is hard to see how it could nevertheless
turn out to be the physical after all. But what I want to try to bring
out in this paper is a difficulty for the Identity Thesis which does not arise
from any special features of the mental or the physical; the difficulty is,
rather, as one might put it, the form of the thesis. In brief, it is not clear
that claims of that form can be anything more than mere reports of correlations,
whatever the kinds of things they purport to identify.
1. The Identity Thesis is often put forward in some such words as “Sensations
are brain processes (or brain events or brain states).” If sensations
are brain processes, then I take it to follow that toothaches are. But which
brain processes are they? If I am not mistaken, it is not yet known precisely
what is going on in a man’s brain when a tooth aches, and hence I shall
just take the liberty of inventing some physiology. I do not think that the
fact it is so (wildly) over-simple will matter for the points I wish to make.
So let us imagine, then, that there is found a certain chunk of brain—call
it the “dentral chunk,” or “d.c.” for short—to which are attached nerves
coming from each tooth. It is found that whenever a man has a toothache
his d.c. is in a certain state, viz. the state of being swollen; and let us imagine
that the onset of his toothache is contemporaneous with the onset of
the swelling. (This could be made more specific: If it is his upper front
teeth that are starting to hurt, then his d.c. is starting to swell at the place
at which the nerves from the upper front teeth join it.) If the toothache is
a throbbing, pulsating ache, then the d.c.’s swelling varies accordingly:
alternately swelling and shrinking. Lastly, let us imagine that physiologists
do put forward, as psychophysical laws, correlations between toothaches
and swellings of d.c.’s, and between variations in toothaches and variations
in swellings of d.c.’s.
Now there is a possible interpretation of the Identity Thesis—call it the
“Naive Interpretation”—which takes it literally: Toothaches themselves
(as also itches, tingles, and so on) are brain processes, events, or states.
Thus, for example, toothaches are swellings of d.c.’s.
But the Naive Interpretation is surely at best false. No doubt one does
not naturally say that one’s toothache is in one’s tooth (where else could a
toothache be?), but it does come perfectly naturally to say, and would on
occasion be true to say, that one has an ache (or perhaps even a toothache)
in one’s front tooth, a pain in one’s knee, and an itch in one’s nose.
It is not, of course, straight off clear what we should say about the location
of processes, events, or states, but at any rate it surely will not do to say
that the process of swelling currently being undergone by a d.c. is in its
owner’s front teeth; or that the event, its starting to swell, is there; or that
the state of the d.c., its being swollen, is there. But if that is where the ache
is, then the ache is not identical with the process, event, or state of the d.c.
But at the same time, it may, I take it, be supposed that no defender of
the Identity Thesis—“Materialist,” for short—has ever meant by the Identity
Thesis what I am calling the Naive Interpretation. The example of a “sensation”
which Professor Smart, for example, deals with in “Sensations and
Brain Processes”2 is the having of a yellowish-orange after-image rather
than toothache, and he says: “I am not arguing that the after-image is a
brain-process, but that the experience of having an after-image is a brain-
process.”3 One supposes he would say the same of aches and pains: It is
not the ache itself but the experience of having an ache which is a brain
process. Now the experience of having an ache in one’s front teeth is not
itself in the front teeth, so the location difficulty does not arise: If Smart
does wish to say about toothaches what he said about after-images, then he
is not committed to that which would imply that a swelling of a man’s
d.c. may well be in his front teeth.
But that the Naive Interpretation is at best false, and that no Materialist
(as I take it) has ever meant the Identity Thesis to be construed in that
way, is important; I shall return to the consequences of this in the following
section.
So it is to be the experience of having a toothache which is a brain
process. More specifically, in the light of my (imaginary) physiology:

(T1) A man’s having a toothache is his d.c.’s being swollen,


and

(T2) Events which consist in a man’s starting to have a toothache are events
which consist in his d.c.’s starting to swell,

and so on. And these are, of course, not to be construed merely as (odd)
ways of reporting the correlations I described at the outset; they are to be,
literally, identity claims.
One gain of saying these things would be this: we no longer have to
worry about how d.c. events could cause toothaches. If the car crash you
saw from your window was the very same as the car crash I saw from mine
then it is ruled out that one caused the other. More generally, there no
longer is a “mind-body” problem: we no longer have to worry about how
bodily events could cause mental events. Depending on the events in question,
we can say either “They don’t, for they are the same”, or “They do,
but this is just bodily ones causing other bodily ones.”
A second gain is supposed to be4 a scientific simplification: we shall be
able to eliminate “nomological danglers,” i.e., irreducible laws correlating
mental and bodily events. “I cannot believe,” Smart says,5 “that ultimate
laws of nature could relate simple constituents [such as toothaches] to configurations
consisting of perhaps billions of neurons (and goodness knows
how many billion billions of ultimate particles) . . . .” If the mental just is the
physical, then we have no need of such laws; toothaches are not correlated
with d.c. events but are them.
2. But what can a man mean who says (T1) and (T2)? It surely is not
plain on the face of it. Nor do the writings of the Materialists give us much
help. We are told straight off they do not mean that “has a toothache”
(or “onset of a toothache”) entails or is entailed by “has a swollen d.c.”
(or “onset of a swelling of a d.c.”): (T1) and (T2) are meant to be contingent
identity claims. We are next given a series of identity claims, of
which the following is, I think, a fair sample:
(a) Jones’ table is an old packing case.
(b) The morning star is the evening star.
(c) Duly elected presidents of the United States are commanders-in-chief
of the United States armed forces.
(d) Clouds are masses of water droplets or other particles in suspension.
(e) That table is a cloud of molecules.
(f) Water is H2O.
(g) Temperature is mean kinetic energy.
(h) Lightning is an electrical discharge.
But it is not plain what we are to do with these examples. If the point of
putting them before us was merely to remind us of the fact that there are true
contingent identity claims, then they do certainly succeed in this. But we
may have clearly in mind the fact that there are true contingent identity
claims, and yet still feel very unclear as to what (T1) and (T2) mean. Are
(a) through (h) supposed to shed light on what is meant by (T1) and
(T2)? How are they to do this? They do not identify states or events. The
subject-terms of (a) through (e) stand for physical objects; “water” stands
for a kind of stuff; “temperature” for a feature of things; and “lightning”—
well, it is hard to say what category lightning falls into, but at all events the
word does not stand for either a state of a thing, or for an event or set of
events.
We could easily construct from each of these a statement asserting identity
between states or events. Thus, for example, from (a):

(A1) Jones’ table’s being red is an old packing case’s being red,

and

(A2) Events which consist in someone’s painting Jones’ table are events
which consist in someone’s painting an old packing case.

And from (f):

(F1) A man’s drinking a glass of water is his drinking a glass of H2O,

and

(F2) Events which consist in a man’s pouring water are events which consist
in his pouring H2O.

But these are useless as aids to the understanding of (T1) and (T2): for
the capital-letter examples constructed from the small-letter examples in
this way are true if and only if the small-letter ones are true. But is there
supposed to be an analogous small-letter example (t), such that we could
say that the T’s are true if and only if (t) is true? The candidates will have
to have the form:

(t) Toothache is ... (or: Toothaches are .. .)

But if these dots are filled in with “a swelling of a d.c.” (process or event),
or “a swollen state of a d.c.” (state), then (t) itself is the Naive Interpretation.
And that, as we said, is at best false. No Materialist means to be
putting it forward in putting forward the Identity Thesis.
My point, then, is this: Though (a) through (h) are contingent identity
claims, they do not identify states or events and hence do not shed light
on the meaning of the T’s. By contrast, the A’s and F’s do identify states
and events; but they are true if and only if (a) and (f) are true—and so
if we are to take the analogy seriously, we shall have to say that the T’s
are not true, for (t), the Naive Interpretation, is not true.
There might be an objection to this on the score of (h), and so—and in
view of the fact that it is perhaps the most frequently offered of all these
examples—we ought to digress and take a closer look at it.
“Lightning is an electrical discharge.” Ought we not first ask whether
this is contingent? The Concise Oxford Dictionary gives as its first reading
for “lightning”: “Visible electrical discharge between clouds or cloud and
ground.” But if (h) is analytic, then it certainly would not help in the way
of making the meaning of (T1) and (T2) any clearer.
There might be an argument as to whether or not the Concise Oxford is
right. But we can sidestep this; we can say that whatever the word does
mean, we are to construe it in (h) as not meaning what the Concise Oxford
says, but instead as meaning “flash of light in the sky.” Or, since not all
flashes of light in the sky are lightning flashes, “flash of light in the sky of
kind L.” Or, to avoid the locution “of kind L,” we can instead deal with a
particular claim which a man might make, referring to a flash of light in
the sky which was of kind L:

(h*) That flash of light in the sky just now was an electrical discharge.

Now it might be said that while “flash of light” does not seem to stand
for an event, “electrical discharge” does—for we can ask of an electrical
discharge such things as when it took place—and thus that perhaps this
example at any rate escapes the objections I raised to (a) through (h).
But the trouble with (h*) for the Materialist’s purposes is that its truth
on a given occasion is compatible with the truth of: “That flash of light
in the sky just now was caused (or produced) by an electrical discharge.”
Compare (h*) with the following:

(x) That flash of light in the sky just now was a meteor shower.
(y) That sound just now was someone clapping his hands.
(z) That lovely smell in the kitchen now is garlic.

These may on occasion all be said, and said truly. But on any such occasion,
their truth is compatible with the truth of “That flash of light was caused
by a meteor shower,” “Someone’s clapping his hands was what made that
sound,” “Garlic in the stew is what’s responsible for the lovely smell in
the kitchen now.” (Indeed, there may even be some who would want to
say that (x), (y), and (z) are elliptical for the longer causal claims.)
If I am not mistaken, what gets said nowadays is

(i) Light is a form of electromagnetic radiation.

An electrical discharge from cloud to ground may cause a flash of light
in the sky, but so may other things too, such as starting a bonfire in front of
a polished reflector. This does not mean that we would be speaking falsely
if we said, “That flash of light in the sky just now was an electrical discharge,”
or “. . . was our bonfire,” any more than the fact that sound is a
wave motion makes it false to say (y). We often truly say such things as
these—where we could equally truly (if more pedantically) have made the
longer causal claims instead.
But then (h*) sheds no light on the meaning of (Ti) and (T2). For they
are not supposed to be compatible with saying that toothaches are caused
by swellings of d.c.’s: It was precisely to rule out that the relation is causal
that they were put forward in the first place.
It might be suggested that what this shows is that we should have been
considering (i) instead of (h). But I think it will be clear that the remarks
I made above about (a) through (h) hold for (i) as well—its subject term,
“light,” does not stand for either a state of anything or an event or set
of events.
3. My point thus far has been only that it is not plain on the face of
it what is meant by (Ti) and (T2), and that the contingent identity claims
which are most commonly offered as aids to the understanding of (Ti) and
(T2) do not help us to understand them.
If it were merely a matter of finding a contingent identity claim which
asserts identity between states or events, the difficulty would soon enough
be eliminated, for they are not at all hard to come by. Consider, for example
(J) The sinking of the Titanic was the largest marine disaster ever to occur
in peacetime.6
And
(K) Car crashes which Jones sees from his window are car crashes which
Smith sees from his,

which we can imagine is true. Both of these, I take it, could be said to
assert identity between events. But, as I shall argue, they do not help. I
want to try in this section to give a ground for saying that nothing will.
To begin with, let us go back to (Ti) again. (Ti) said that a man’s
having a toothache is his d.c.’s being swollen. Whatever this may mean,
it is supposed to be an identity and we can presumably operate with it by
the usual rules. So let us notice that “x’s tooth aches” is equivalent to “x has
a toothache,” and therefore, as I take it, we can say that a man’s tooth’s
aching is his having a toothache.7 From this, together with (Ti), there follows
by transitivity of identity

(Ui) A man’s tooth’s aching is his d.c.’s being swollen.

Similarly, we can say that a man’s tooth’s starting to ache is his starting to
have a toothache; and from this, together with (T2), there follows
(U2) Events which consist in a man’s tooth’s starting to ache are events
which consist in his d.c.’s starting to swell.

So then if (Ti) and (T2) are true, (Ui) and (U2) are true; and a man who
says (Ti) and (T2) should also be ready to say (Ui) and (U2). But what
could a man mean who said these?
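The inference to (Ui) can be set out schematically. What follows is only a sketch: the letters a, t, and d are labels introduced here for illustration and do not appear in the text.

```latex
% Labels (assumed here, not the author's):
%   a = a man's tooth's aching
%   t = his having a toothache
%   d = his d.c.'s being swollen
\begin{align*}
a &= t && \text{(from the equivalence of ``$x$'s tooth aches'' and ``$x$ has a toothache'')}\\
t &= d && \text{(Ti)}\\
\therefore\ a &= d && \text{(Ui), by transitivity of identity}
\end{align*}
```

(U2) is obtained in the same way from (T2), with startings to ache and startings to swell in place of the states.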
(Ui) may be said to be of the form “A’s F-ing is B’s G-ing”; and what
perhaps strikes one first is that, in it,

1. “A” and “B” refer to distinct and different physical objects,

for a man’s tooth and his d.c. are distinct and different physical objects, and
2. “x is F” and “x is G” do not entail each other; they are not contingently
identical; they are not even materially equivalent,

for my left hand now aches, but is not swollen.


It might be objected at this point that I have got this wrong—that we
should be considering neither (Ui), nor even my initial (Ti), but rather
(Vi) A man’s having a toothache is his having a swollen d.c.

The states referred to in (Vi) are states of the same thing, a given man,
and the Materialist’s very point is that the properties “x has a toothache”
and “x has a swollen d.c.” are contingently identical. (They are at least
materially equivalent: By hypothesis, the physiological facts include that
they are.)
But as we said, a man’s tooth’s aching is his having a toothache. We can
surely also say that a man’s having a swollen d.c. is his d.c.’s being swollen.
From these, together with (Vi), we can deduce (Ui) by transitivity of
identity. So, as in the case of (Ti), a man who says (Vi) should also be
ready to say (Ui). If he is not prepared to use the transitivity of identity to
obtain consequences from (Vi), we shall have a right to wonder why he
uses “is”—and indeed, allows his general thesis to be called “The Identity
Thesis.”
To return, then, to (Ui), I mentioned above two features which it possesses.
There are more which should be noted.
3. Neither F nor G is a relational property.8
4. “A is F” neither entails nor is entailed by “B is G.”

Next is a feature which we know (Ui) is supposed to have because of what
we know that the Identity Thesis is intended to do. It is intended to rule
out our saying that brain states or events cause mental states or events,
and vice versa; so (Ui) must be construed as incompatible with saying
that the tooth’s aching causes or is caused by the d.c.’s being swollen:
5. The truth of “A’s F-ing is B’s G-ing” is incompatible with either A’s
F-ing or B’s G-ing causing the other.

Indeed, we are not to speak of the mental state (the tooth’s aching) and
the physical state (the d.c.’s being swollen) as two; hence we are not to say
that they are two joint effects of a common cause:
6. A’s F-ing and B’s G-ing are not two joint effects of a common cause.

Are there any respectable identity claims which we should regard as
true which are of the form “A’s F-ing is B’s G-ing,” and of which 1
through 6 could be said? I am not getting at a difficulty which is due to the
Materialist’s identifying the mental with the physical; let it be an identity
claim which identifies the physical with the physical, or whatever you like.
Well, people do sometimes say things like
(L) So, as you can see, this electrode’s being positively charged is that
electrode’s being negatively charged,9

and
(M) Lady Bird’s being First Lady just is Lyndon’s being President,10

and they do get by perfectly well. Nobody boggles when such things are
said.
There is room for argument about this, of course. For example, it might
be said that 6 is not true of them: In both cases what we have is two joint
effects of a common cause. But we need not argue this. Let us ask instead
what would be the point of saying (L) or (M)? Better, consider a man
who says (L) or (M); shall we not suppose that he thinks there is a reason
why this electrode’s being positively charged “is” (as he puts it) that
electrode’s being negatively charged? An explanation? Perhaps he would
refuse to call it a causal explanation; but does he think that there is an
explanation of some kind or other of the fact he asserts?
The mere possibility of giving an explanation does not by itself rule out
that a claim should be an identity claim. There are identity claims which
may well be true in respect of which it would be in place to ask “How
come?” For example, we might be told that car crashes Jones sees from his
window are all of them car crashes which Smith sees from his. We might
ask “How come?” and be told that it just turns out that way, a sheer
coincidence; but we might instead be told that they are both professional
ambulance chasers and, save while actively chasing ambulances, are both
at their (neighboring) windows all day. So I am not, in other words, going
to argue that if a man says (L) and (M), thinking there is an explanation
of how come this electrode’s being positive “is” (as he puts it) that electrode’s
being negative, how come her being First Lady “is” his being President,
then it follows that he does not make an identity claim or that he
ought not have used the word “is.”
In the case of (M), there plainly is an explanation—namely, that
Lyndon is married to Lady Bird and, by custom, whatever female is
married to the President is First Lady. In the case of (L), there will be
one in the right sort of context. The speaker himself might say: “Oh, I thought
you knew. There’s a dilute solution of sulphuric acid in the cell, and . . .
This electrode is copper, and that one is zinc. . . . Now in fact, the copper
dissolves more slowly than the zinc. Up here is a wire connecting the two
electrodes. Etc.” The very point of his using the word “is” might be to
stress that if it is true to say “A is F” it must also be true to say “B is G.”
But now suppose we ask for an explanation in the case of (Ui)? How
come it is a tooth’s aching, and not, for example, a foot’s aching which is
the d.c.’s being swollen? We might be told, “Well, it’s the teeth which are
neurally linked with the d.c.; the feet are not.” But the trouble with this
is that it is not clear how it is supposed to explain what was up for explaining.
The existence of a neural link between the teeth and the d.c.
would explain why damage to the teeth causes a swelling of the d.c.; the
nonexistence of a neural link between the feet and the d.c. would explain
why damage to the feet does not cause a swelling of the d.c. But how does
that explain why it is the teeth rather than the feet that ache?
This comes out more clearly if we imagine that our question was not
as to why the teeth rather than the feet but instead why an aching rather
than, for example, a prickling. If it is said, “Ah, but you see, the swelling
is of a special kind, kind K,” then it is plain we stand in need of something
to connect achings with swellings of kind K.
According to the familiar model for explanation, what should be called
upon to connect them is a general law, such as: Achings of part P of the
body are accompanied by, or correlated with, kind-K-swellings of that
chunk of the brain to which P is neurally linked. The law appealed to must
not be causal—must not say that achings are caused by kind-K-swellings—
for, of course, this would be to defeat the purpose of putting forward the
Identity Thesis. As I said, one thing which was to be gained was to rule
out that mental events are caused by bodily ones. But it will be remembered
that I mentioned a second thing which was to be gained if we accept the
Identity Thesis: We were to achieve a scientific simplification.11 In fact,
the simplification we are to achieve is supposed to consist in our being
able to eliminate “nomological danglers,” i.e., laws correlating mental and
bodily events. But then to appeal to the general law “Achings are accompanied
by swellings” as an explanation of (Ui) would be self-defeating
too, for then here would be one correlating law at all events which we
could not do without.
We could try to explain (Ui) not by appeal to a general law but instead
by appeal to a general identity claim, such as: Achings of part P of the
body are kind-K-swellings of that chunk of the brain to which P is neurally
linked. But this is surely no less obscure than (Ui) itself—if we did not
understand (Ui), we shall certainly not understand this either. Is the possibility
of making statements like (L) and (M) (page 225 above) supposed
to shed light on what it means? (For the sample identity claims of the
preceding section—such as “Jones’ table is an old packing case”—certainly
do not.) But then are we to suppose that, as in the case of (L) and
(M), it can be explained how come achings-of-P are kind-K-swellings?
Are we to appeal for the explanation of it to some yet higher-level identity
claim?
Sooner or later we would reach an identification between mental and
physical of which we would have to say, “This seems to be just as a matter
of fact true—no explanation, no reason why, so far as anybody knows.”
Why should achings be kind-K-swellings rather than kind-K'-swellings?
Perhaps it goes further, perhaps it stops here, with “Nobody knows. They
just are.” Rather than shifting examples in mid-stream, I shall suppose it
is (Ui) itself of which this is said, and hence add to the list of features possessed
by (Ui):

7. So far as anybody knows, A’s F-ing just happens to be B’s G-ing; we
have no explanation and know of no reason why.

This rules out an analogy between (Ui) on the one hand and (L) and
(M) on the other. For we are supposing that one who says (L) or (M)
can give, or at least thinks that there is, an explanation “how come.”
(Ui) is of the form “A’s F-ing is B’s G-ing,” where the replacements
for the variables A, B, F, and G are supposed to be such as to make conditions
1 through 7 true. I have said nothing yet about (U2), but a similar
thing can be said about it. (U2) is of the form “Events which consist in
A’s starting to F are events which consist in B’s starting to G”—or, more
simply, “A’s starting to F is B’s starting to G”—where A, B, F, and G
are supposed to meet the same conditions 1 through 7. Are there any respectable
identity claims which we would regard as true of which either of
these two things can be said? Pretty plainly the two event identity claims
(J) and (K) which I gave at the beginning of this section cannot be so
described, for they are not of the right form.
Suppose someone were to say

(K*) Jones’ seeing a car crash from his (Jones’) window is Smith’s seeing a
car crash from his (Smith’s) window.

As a form of words, this would probably get by. But what could a man
mean by it, if he also says that 1 through 7 are true of it? If he says that
5, 6, and 7 are true of it, then we cannot suppose he means to be pointing
to a causal connection between Jones’ seeing a car crash and Smith’s seeing
a car crash; indeed, he must say that so far as is known it just happens
that Jones’ seeing a car crash “is” Smith’s seeing a car crash. So his remark
cannot be supposed to have the point which we imagined (L) and (M) to
have. But then what could he mean to be saying?
I am inclined to think that we can only understand what he says if we
suppose him to be saying no more than that whenever Jones sees a car
crash from his (Jones’) window, Smith sees a car crash from his (Smith’s)
window. And that if he rejects this—if he says, “No, I don’t mean ‘whenever’;
I said ‘is,’ and I meant ‘is’ ”—then we shall simply not understand
what he means.
It could perhaps be put like this. Presumably (K*) will not be true unless
“Whenever Jones sees a car crash from his (Jones’) window, Smith sees
a car crash from his (Smith’s) window” is true; i.e., presumably (K*)
entails the correlation-statement. But if (K*) is to mean something more
than the correlation-statement, we shall have to suppose it is not entailed
by it. So the correlation-statement could be true and (K*) false. But in
view of the way in which we must regard (K*), 1 through 7 being true of
it, what could be lacking to make it true if it is given that the correlation-
statement is true? Suppose it a coincidence that whenever Jones sneezes,
Smith coughs; under what conditions would it be true to say “Jones’ sneezing
is Smith’s coughing,” and under what (different) conditions would it be
true to say “Admittedly whenever Jones sneezes, Smith coughs, but it is
not the case that Jones’ sneezing is Smith’s coughing?”
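The structure of this argument can be put schematically. In the sketch below, C abbreviates the correlation-statement; that label, and the use of the double turnstile for entailment, are notational assumptions made here rather than anything in the text.

```latex
% C  = "Whenever Jones sees a car crash from his window,
%       Smith sees a car crash from his"   (label assumed here)
% K* = the identity claim (K*) of the text
\begin{align*}
&K^{*} \models C && \text{(presumably: (K*) is not true unless $C$ is)}\\
&C \not\models K^{*} && \text{(required if (K*) is to say more than $C$)}\\
&\therefore\ C \wedge \neg K^{*} \text{ is possible} && \text{(a case in which $C$ holds and (K*) fails)}
\end{align*}
```

The question pressed in the text is what such a case could consist in, given that 1 through 7 are supposed true of (K*).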
So far as I can see there is no way in which it could be proved that a
man who said (K*), and said that 1 through 7 are true of it, could mean
no more by what he said than just that whenever Jones sees a car
crash . . . , Smith sees a car crash. . . . But at the same time, we can
surely say this: Pending our being given an account of what more could
be meant, we are entitled to the suspicion that nothing more is meant.
My point, then, is this. (Ui) has a certain form, and 1 through 7 can
be said of it. But—pending further explanations on the part of the Materialists—it
looks as if no claim which is like (Ui) in this respect can be anything
more than a report of a correlation—whatever it may purport to
identify, whatever kind of thing it may say “is” thus and such other kind
of thing. If I am right, then, in the absence of further explanation, in the
absence of an account of how claims like (Ui) in this way can be construed
as anything more than reports of correlations, we have a right to a
suspicion that (Ui) would have to be construed as a report of a correlation
too.
What we began with was (Ti) and (T2); and I said of them that it is
not plain on the face of it what a man could mean who said them. Nor
was it plain on the face of it what a man could mean by (Vi) and (V2).
The Materialist did not mean merely to be saying that whenever a certain
kind of mental event takes place, a certain kind of bodily event takes place;
he wants to say that the one is the other—(Ti), (T2), (Vi), (V2) are to
assert identity. (Ti), for example, is supposed to say something stronger
than just that whenever a man has a toothache, his d.c. is swollen. We
noted, however, that a man’s tooth’s aching is his having a toothache; we
can say that it is this and not merely that it is correlated with this. So if
(Ti) is stronger than a mere correlation-report, (Ui) ought to be so too,
identity being transitive. Thus the ground there is for thinking that (Ui)
cannot be more than a correlation-report will also be ground for thinking
that (Ti) cannot be more than a correlation-report. And similarly for
(T2), (Vi), and (V2). The difficulty which faces a man who says (Ui)
is also a difficulty for a man who says these other things.
It remains to call to the reader’s attention the fact that there is not going
to be any move comparable to this in the case of the K’s. (K), it will be
remembered, was: “Car crashes which Jones sees from his window are
car crashes which Smith sees from his.” Surely no remarks about event-
identity ought to have as a consequence that this is not an identity claim;
what this says is not that the crashes are correlated with each other but
precisely that they are the same. (K*) was “Jones’ seeing a car crash from
his window is Smith’s seeing a car crash from his window,” and I said that
if we are to suppose 1 through 7 are true of it, it looks as if (K*) cannot
say any more than that whenever Jones sees a car crash from his window,
Smith sees one from his. But the fact (if it is a fact) that this is true of
(K*) does not make trouble for (K), for there is no move from (K)
to (K*) analogous to the one I made from (Ti) to (Ui). A man’s tooth’s
aching is his having a toothache; from this together with (Ti) we can
obtain (Ui) by transitivity of identity. From (K) together with “Jones’
seeing a car crash is a car crash which Jones sees” and “A car crash which
Smith sees is Smith’s seeing a car crash,” we can obtain (K*) by transitivity
of identity. But the quoted statements are surely at best false. Jones’ seeing
a car crash is not a car crash at all, and hence a fortiori not a car crash
which Jones sees. No car crash is Smith’s seeing a car crash, and hence
a fortiori no car crash which Smith sees is Smith’s seeing a car crash.
4. On p. 225, I merely mentioned, but did not discuss, an important condition
on F and G, namely 3: Neither F nor G is a relational property.
This condition is important; if it is omitted, then there may well be replacements
for the variables which give what look to be quite respectable
identity claims. Suppose, for example, that the relation which Henry most
fears to have to anyone is “being hated by.” And suppose Sarah hates
Henry. Then I think we can say that Sarah’s hating Henry is (strictly
speaking the very same state of affairs as) Henry’s having to Sarah that
relation which he most fears to have to anyone. (For his having that relation
to her is, as a matter of fact, his being hated by her; and this is
equivalent to her hating him.) It might be argued that even 7 could be
supposed true of this example.
So it might perhaps be said that the Materialist need merely restate the
Identity Thesis. I have all along been construing it as the thesis that
mental states and events are brain states and events. But perhaps it should
read: Mental states and events are brain-and-body states and events. In
particular, that (Ui) and (U2) be replaced by

(Wi) A man’s tooth’s aching is his d.c.’s having relation R to that tooth

and

(W2) Events which consist in a man’s tooth’s starting to ache are events
which consist in his d.c.’s starting to have R to that tooth.

Only one of the two predicates in (Wi) is relational, but perhaps there
might be found examples of this kind along the lines of the Henry-Sarah
example of the preceding paragraph.
But in fact it really will not help to reput the Identity Thesis in this form.
To begin with, consider (W2). One thing we can surely say is that for
(W2) to be true, it must be that whatever is the time—say t—at which
the tooth starts to ache must also be the time at which the d.c. starts to
have relation R to the tooth. I am not saying that it is always possible to
date the start of a toothache precisely; I am saying only that whatever the
time assigned to the start of the toothache would have also to be the time
of the start of the d.c.’s having R to the tooth. It seems reasonable to
suppose that event E cannot be identical with event E' unless E and E'
are contemporaneous.
What relation shall we say the d.c. starts to have to the tooth at t, with
which we can identify the tooth’s starting to ache at t? A tooth is drilled
or chilled or damaged in some way; this causes a “message” (an electrical
impulse? I shall just vaguely call it a “message”) to be sent up the neural
canal to the brain—to the d.c., in my physiology. But whatever the nature
of the message, two things are plain. First, it takes time for the message
to get from the tooth to the d.c., and second, the tooth does not start to
ache until the message arrives at the d.c. If we imagine that the message
takes a long time to reach the d.c., then we can suppose that by the time
the tooth starts to ache at t, the tooth is no longer sending any messages
to the brain—for example, the dentist has removed the drill or the tooth
has warmed up. So what could R be?
If we retain the rest of the physiology I invented earlier, we can suppose
that what happens when the message arrives at the d.c. is that the d.c.
starts to swell; and we can suppose that the d.c. starts to swell at t, which
is when the tooth starts to ache. Now we can construct from this a description
of a relation which we could say the d.c. starts to have to the tooth
at t: The d.c. is swollen-in-consequence-of-receipt-of-a-message-from the
tooth. Or, in view of the possibility that the same message could come
not from the tooth but from a place on the nerve between the tooth and
brain: The d.c. is swollen-in-consequence-of-receipt-of-a-message-which-
comes-along-a-nerve-attached-to the tooth. Or we could imagine that the
place of the swelling is what is relevant: The d.c. is swollen-at-the-place-at-
which-is-attached-the-nerve-which-comes-from the tooth. And so on, for
the different further characterizations of the swelling which you get when
you add that the swelling is in this or that way connected with the tooth.
I do not object to calling these “relations”; nor do I think we should
object to saying that X only starts to have one of these relations to Y
when X starts to be swollen-in-consequence-of-etc. The trouble is, rather,
that a d.c.’s swelling which is further characterizable in this or that way—
for example, as caused by this or that—is all the same, of necessity, a
d.c.’s swelling. What this means is that if you say that a man’s tooth’s
starting to ache is his d.c.’s starting to swell, this swelling being further
characterizable as of kind K (where “K” refers to the tooth), then this
commits you to saying that a man’s tooth’s starting to ache is his d.c.’s
starting to swell. Again, if a man’s tooth’s aching is his d.c.’s being swollen,
this swollen state being further characterizable as of kind K, then a man’s
tooth’s aching is his d.c.’s being swollen. If a man’s table is an old packing
case or a packing case from Bloomingdale’s, then all the same it is a packing
case.
But the statement “A man’s tooth’s aching is his d.c.’s being swollen”
is (Ui), the very thing we started with and hoped to improve on by restating
the Identity Thesis. So this move was no help at all; we are still
committed to saying that A’s F-ing is B’s G-ing, where neither F nor G is
relational.
What produced this conclusion was the following: I asked what event
is supposed to take place at t, with which the tooth’s starting to ache is to
be identified. And I said that we should imagine that what happens (relevant
to the tooth’s aching) at t is that the d.c. starts to swell; since the
required event must be relational, I then went on to give further characterizations
of this starting to swell such that under those descriptions the
event was relational. In other words, I picked out as what happens in the
brain a nonrelational event, the d.c.’s swelling, and then proceeded to construct
a relational event from it by further characterizing it. But then of
course the conclusion would follow: A further characterized X is still an X.
If we have the patience to see this point through to the bitter end, we
could try to imagine that the tooth’s starting to ache at t is just the d.c.’s
starting to have R to the tooth—there being no nonrelational event at t
of which it would be true to say that the starting to ache is that event
taking place. But other things would have to be added too. For what is
supposed to happen at (as it might be) t-plus-five-minutes when the tooth
stops aching? The tooth need not be doing anything relevant to the ceasing-
to-ache at t-plus-five; if the messages take a long time to reach the d.c.,
the tooth might have stopped sending messages long before t-plus-five.
Moreover, toothaches can pulsate, they can wax and wane; we should have
to imagine that this “purely” relational state of the d.c. and tooth can do
so too.
We can invent odd relations—I see no good ground for refusing to call
them “relations”—which do meet these conditions. Suppose that John and
Henry stand in front of a mirror, and Henry smiles at them both for a full
five minutes and then stops—say he starts to smile at t-minus-fifteen and
stops at t-minus-ten. Then at t precisely John and Henry will acquire the
following relation to each other: were jointly being smiled on in a mirror
by Henry exactly fifteen minutes ago. They continue to have this relation to
each other until t-plus-five, and then cease to have it. Their acquiring this
relation to each other and then ceasing to have it is not a matter of any
contemporaneous activities of either of them; we might imagine they fainted
as Henry ceased to smile and have been lying there unchanged ever since.
Moreover, we can even give a sense to talk of their relation waxing and
waning between t and t-plus-five, viz. that Henry’s smile grew broader and
then shrank to a smirk.
In fact I am inclined to think that this is precisely the sort of relation R
would have to be: the d.c.’s having R to the tooth will have to consist in
the tooth’s having done something (sent a message) in the past; variations
in the relational state of the d.c. will have to consist in variations in what the
tooth was doing (variations in the messages) in the past.
But then on this construal of it, the Identity Thesis would commit us to
saying that a present tooth’s aching consists in a past activity of a tooth;
and that present variations in the ache consist in past variations in the
activity of a tooth. Not just that they are causally explainable by these
things in the past, but that they are these things in the past. Could there
conceivably be ground for saying such a queer thing—other than to save
a philosophical point?
For these relational states and events would be of no interest to a physiologist:
They are not needed for the explaining of anything which happens
later. No one need appeal to them as causes of anything or as leaving traces
of any kind. Anything they can explain can more simply and directly be
explained by reference to the character of the messages sent to the d.c.
by the tooth, and the fact that it takes the messages thus and such amount
of time to get to the d.c.
I conclude, then, that nothing is to be gained by restating the Identity
Thesis in the form (Wi) “A man’s tooth’s aching is his d.c.’s having relation
R to his tooth.”
5. Some of Professor Smart’s remarks suggest a reply that he might
wish to make. I have in mind his discussion of the location of pains:

There seems to be no reason why there should not be some cerebral mechanism
which associates the pains that come from point A of the body with the cerebral
mechanism involved in, for example, perception and kinaesthesis of point A.
There may just be a tendency to move point A, to direct our eyes toward A, and
so on. To say that the pain is “in” point A is just to give expression to these
tendencies. Indeed, for those who have acquired language, one of these tendencies
may just be to say to oneself “in my thumb,” or whatever it may be.
The pain may “touch off” the words.

And again, in a footnote:

There may simply be a neural mechanism which constitutes a causal inter-connection
between the brain process which is the sensation and the brain process
or state which explains the set of tendencies to look at, touch, or mention a certain
part of the body.12

If I have not misunderstood these remarks, they come to the following:
What one feels (or has) is a pain (ache, itch, . . .); the having of the
pain is a brain process. Then there is the quite different matter, the “localizing”
of the pain. The pain’s being in my front teeth is a tendency I have
—to point to my front teeth, to nurse them, to complain about them, to
say “I have a pain in my front teeth.” This tendency is explainable by
reference to a brain process or brain state—which is different from the first
brain process, which was the having of the pain. Nevertheless this second
brain process or brain state is causally connected with the first.
It seems to me just barely possible then—if this is Smart’s view—that
he would say I have simply been discussing the wrong example throughout.
There is the mental state, having an ache. This mental state is identical
with a certain brain state. But the ache’s being a toothache, its being an
ache in the upper front teeth, that is another matter entirely, and not
itself even a mental state but rather a tendency to behave in certain ways.
One’s tooth’s aching is a combination of a mental state, having an ache,
and this extra thing, the tendency to behave. Now the locution “a man’s
having an ache” does not mention or refer to a tooth or indeed to any
particular part of the body. So then if we are to identify a man’s having an
ache with something’s going on in the brain, that something need not be
a part-of-the-brain’s standing in a certain relation R to a tooth—or any
particular part of the body. If we suppose that there is a part of the brain
—call it the “ache chunk”—which swells contemporaneously with all
startings to ache, then we can say that a man’s having an ache is his ache
chunk’s being swollen, and the tooth need not be mentioned, even if in
fact it is his tooth which aches.
But surely the same difficulty arises. I am not my ache chunk, nor is it
me; so how can my having an ache be my ache chunk’s being swollen?
(This is how one might pass from located sensations, like aches, to the
nonlocated contents generally, in particular, to Smart’s own example, the
having of an after-image.)
And in any case, the basis for the reply is surely confused. It allows the
possibility of a man who has aches, pains, itches, and the like, but whose
aches, etc., have no location in any part or parts of his body, his brain
being defective in that the brain processes which are the achings, etc., never
cause the second kind of brain process, the “locating” kind. And this,
alas, seems absurd.

NOTES
1. I am grateful to Jaegwon Kim for helpful criticism.
2. Reprinted in The Philosophy of Mind, ed. V. C. Chappell. Prentice-Hall,
Englewood Cliffs, N. J., 1962.
3. Op. cit., p. 168.
4. I say “supposed to be” because it is highly dubious whether any scientific
simplification really is achieved by this move. Cf. Richard B. Brandt, “Doubts about
the Identity Theory,” in Dimensions of Mind, ed. Sidney Hook, and the detailed
and very good discussion of this matter in Jaegwon Kim, “On the Psycho-Physical
Identity Theory,” American Philosophical Quarterly, July 1966. The fundamental
points are to be found in the indispensable and by now classic chapter on “The
Reduction of Theories,” in Ernest Nagel’s The Structure of Science.
5. Op. cit., p. 162.
6. From Thomas Nagel, “Physicalism,” Philosophical Review, July 1965. If I
am not mistaken, this is the only example in the literature on the Identity Thesis
which is a contingent identity claim about events.
7. It might be argued that “X’s tooth aches” is not equivalent to “X has a toothache”
on the ground that a man could have a toothache though he has no teeth.
But (a) it really is not plain that lack of teeth makes more trouble for “My teeth
ache” than it does for “I have a toothache.” And (b), if it does, the discussion
throughout this paper can be explicitly restricted to men who have all their teeth.
Let “X is a toothful” mean “X is a man who has all his teeth.” Should not the
Identity Thesis be true of toothfuls as well as of the rest of us? But “X is a toothful
and X’s tooth aches” is surely equivalent to “X is a toothful and X has a toothache,”
and so a toothful’s tooth’s aching is his having a toothache.
8. Cf. section 4 below.
9. Example from Jerry Feder.
10. Example from James Thomson.
11. Cf. note 4, above.
12. J. J. C. Smart, Philosophy and Scientific Realism, p. 104.
EXTENSIVE MEASUREMENT WHEN
CONCATENATION IS RESTRICTED AND
MAXIMAL ELEMENTS MAY EXIST¹
R. Duncan Luce² and A. A. J. Marley³

1. INTRODUCTION

Theories of fundamental measurement begin with purported qualitative
laws about observable relations among certain entities, and using these laws
a numerical representation is constructed which reflects their formal structure.
The best known example, extensive measurement, arose from an
analysis of such familiar physical measures as mass and length. In measuring
mass, the classical theory supposes that we can ascertain which of two
entities has the greater mass by comparing them, for example, on an equal-
arm pan balance in a vacuum—the one in the pan that drops having, by
definition, the greater mass. If the comparison of x with y shows that x has
more than or an amount equivalent to y of the attribute being measured,
we write xRy. Furthermore, the theory assumes that new entities can be
generated from old ones by placing two or more of the old ones on a single
pan. The latter combining operation is called “concatenation,” and the new
entity generated by concatenating x with y is denoted by xoy.
Various plausible and, to some degree, empirically true assumptions
about o, R, and their interconnections are stated (for a summary of them,
see Definition 2 and the discussion following it), from which it is shown
that we can assign to each entity x a real number $\varphi(x)$ in such a way that
the function $\varphi$ has the following three properties. First, it is order preserving
(monotonic),

$$x R y \ \text{ if and only if } \ \varphi(x) \geq \varphi(y);$$

second, it is additive over concatenations,

$$\varphi(x \circ y) = \varphi(x) + \varphi(y);$$

and third, any other assignment having the first two properties differs from
$\varphi$ only by a positive multiplicative factor, i.e., the family of representations
forms a ratio scale.
Systematic discussions of these ideas have been given by, among others,
Campbell (1920, 1928), Nagel (1932), Suppes (1951), and Suppes and
Zinnes (1963).
Among the various idealizations embodied in the traditional theory of extensive
measurement, one that is rarely, if ever, experimentally fulfilled,
even approximately, is the postulated freedom to concatenate any two
elements of the system to form a third that is also within the system. A
simple finite induction shows that if x is in the system and n is a positive
integer, then the concatenation of n elements, each of which is equivalent to x,
is also in the system. Denoting this element by nx, if an additive
representation $\varphi$ exists, we have $\varphi(nx) = n\varphi(x)$. Thus, in theory, $\varphi$ is
unbounded. In measuring mass with a pan balance, one obviously cannot
concatenate freely without either damaging the balance or running out of
space.
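
The induction behind this claim can be written out in two lines. The following is only a sketch, assuming the additivity property stated above and the recursive definition $nx = (n-1)x \circ x$ that is given formally in Axiom 5 of Definition 2:

```latex
\begin{align*}
\varphi(1x) &= \varphi(x), \\
\varphi(nx) &= \varphi\bigl((n-1)x \circ x\bigr)
             = \varphi\bigl((n-1)x\bigr) + \varphi(x)
             = (n-1)\,\varphi(x) + \varphi(x)
             = n\,\varphi(x).
\end{align*}
```

Provided $\varphi(x) > 0$, as in Theorem 5, the values $n\varphi(x)$ exceed any fixed bound once n is large enough, which is why unrestricted concatenation forces an unbounded scale.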
It appears that most theorists believe that this practical limitation places
an unavoidable limit on the precision of measurement: specifically, that if
this limitation were incorporated into the theory it would either make it
impossible to construct a numerical representation or, if the construction
were possible, two representations would not necessarily be related by a
similarity transformation, i.e., the theory would not lead to a ratio scale
representation. As we shall see, neither alternative is true. Precise ratio
scales over finite sets of elements are possible provided that the usual
axioms are modified slightly. The basic fact is that it is unnecessary to
construct actual concatenations of indefinitely many replicas of a given
element in order to achieve precision; fine subdivisions do just as well, as
has been accepted in practice.
The single major exception to the comments of the last paragraph
is Behrend (1956), whose work was only brought to our attention (by
Richard Robinson, whom we thank) after the present paper was completed.
Behrend gave a system much like that of Definition 2, and he proved
the same representation as given in Theorem 5. We cite the differences
between the two systems following Definition 2. Behrend’s proof is substantially
the same as ours. Nonetheless, we prove the result here for three
reasons: first, for completeness; second, because Behrend’s paper does not
seem to be widely known; and third, because our proof differs in some ways,
including the fact that it covers the case where indifference is not equality.
(A similar development for additive conjoint measurement [Luce and
Tukey, 1964] can be found in Luce, 1966.)
A second limitation of the traditional theory is its failure, even when
there is no restriction on concatenation, to take into account the possibility
that the system may have a maximal element. The best known example is
velocity: according to the theory of relativity, no velocity may exceed
that of light. This is true in spite of the fact that, in principle at least, any
two velocities may be concatenated (superimposed) to form a new one; it
simply means that they summate in a particular way so that the resultant
velocity remains less than or equal to that of light. With this example in
mind, we also wish to modify the axioms so as to admit the possibility that
maximal elements may exist.

2. THE AXIOMS

The axiom system given in Definition 2 is a modification of Suppes’
(1951) and Behrend’s (1956) systems; the exact nature of the modification
is pointed out following the formal statement of the system.
Definition 1. Let $A$ be a non-empty set, $B$ a non-empty subset of $A \times A$, $R$
a binary relation on $A$, and $\circ$ a binary function from $B$ into $A$. An element
$a \in A$ is maximal relative to $R$ and $\circ$ if, for all $x \in A$, $aRx$, and there is some
$x \in A$ such that $(a,x) \in B$.
For brevity, we sometimes refer to an element simply as maximal without
specifying that it is relative to $R$ and $\circ$.
Definition 2. Let $A$ be a non-empty set, $B$ a non-empty subset of $A \times A$,
$R$ a binary relation on $A$, and $\circ$ a binary function from $B$ into $A$. The
quadruple $\langle A,B,R,\circ \rangle$ is called an extensive system if, for all $x,y,z \in A$,
the following five axioms hold:

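1. $R$ weakly orders $A$.
2. If $(x,y) \in B$ and $(x \circ y, z) \in B$, then $(y,z) \in B$, $(x, y \circ z) \in B$, and
$(x \circ y) \circ z \mathrel{R} x \circ (y \circ z)$.
3. If $(x,z) \in B$ and $xRy$, then $(z,y) \in B$ and $x \circ z \mathrel{R} z \circ y$.
4. If not $xRy$, then there exists an element $y - x \in A$ such that $(x, y-x) \in B$,
$y \mathrel{R} x \circ (y-x)$, and $x \circ (y-x) \mathrel{R} y$.
5. Let $n$ be a positive integer. For $n = 1$, define $1x = x$. For $n > 1$, if
$(n-1)x$ is defined and $((n-1)x, x) \in B$, then define $nx = (n-1)x \circ x$. For
all non-maximal $y \in A$ and all $x \in A$, the set
$$\{n \mid n \text{ is a positive integer and } y \mathrel{R} nx\}$$
is finite.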

The major changes introduced into Suppes’ system are these:


(i) R is assumed to be a weak order, not just transitive; Suppes’ proof
that R is connected depends on properties of o that we no longer possess,
and so we are forced to add that property explicitly.
(ii) Suppes’ Axiom 2 says, in essence, that $B = A \times A$. We have weakened
it considerably by requiring the properties specified in Axioms 2-4. Among
other things, it is shown in Lemma 1 that if x and y can be concatenated,
then so can y and x and so can any pair of elements that, under R, are
dominated by x and y. Both of these conditions are likely to be satisfied
by any empirically interesting interpretation of the system.
(iii) Axioms 2-4 are the same as Suppes’ Axioms 3-5 provided only that
the relevant concatenations are possible.
(iv) Suppes’ system includes between our Axioms 4 and 5 the axiom that
for all x,yeA, not xRxoy. As we show in Theorem 1, the somewhat weaker
statement xoyRx follows from the other axioms, and in Theorem 2 we show
that Suppes’ axiom is equivalent to the assumption that there are no
maximal elements.
(v) With unrestricted concatenation, the Archimedean Axiom 5 can be,
and usually is, stated as follows: if yRx then there exists a positive integer
n such that nxRy. In the presence of the other axioms and with unrestricted
concatenations (i.e., $B = A \times A$), it is clear that our formulation is equivalent
to the usual one. When concatenation is restricted, the usual formulation
is meaningless and so something like our Axiom 5 is needed. Using velocity
as a guide, we have assumed that the Archimedean property holds only
for non-maximal y. Of course, if B is finite, as it will be in practice, Axiom
5 is satisfied trivially.
The relation to Behrend’s (1956) system is as follows.
(i) Let xIy if xRy and yRx, and xPy if xRy and not yRx. Behrend assumed
I to be equality, and he used two of its usual properties, specifically,
substitutability and the fact that it is an equivalence relation. For this reason,
he needed only to assume that R is connected; however, with I different
from equality we must assume R is a weak ordering. In particular,
his proof that P is transitive is no longer valid.
(ii) Behrend stated Axiom 2 in terms of I with the existence of the
concatenations on the right implying those on the left, rather than the other
way round.
(iii) Instead of Axiom 3, Behrend assumed: if (x,z)eB, then xIy if and
only if (y,z)eB and xozIyoz; and if (z,x)eB, then xIy if and only if (z,y)eB
and zoxIzoy. He showed that the analogous statements hold for P and that
I is commutative.
(iv) Axiom 4 is unchanged.
(v) As in Suppes’ system, the property xoyPx is assumed.
(vi) His Archimedean condition is essentially as we have given it.
The following simple example establishes that these changes are significant
in the sense that a system with very few elements can fulfill the
axioms, an additive representation exists, and it is a ratio scale:

$$A = \{a,b,c,d,e\},$$
$$B = \{(e,e),\ (d,e),\ (e,d),\ (c,e),\ (e,c)\},$$
$$\circ:\quad e \circ e = c,\quad d \circ e = e \circ d = b,\quad c \circ e = e \circ c = a,$$
$$R:\quad a \mathrel{I} b \mathrel{P} c \mathrel{I} d \mathrel{P} e,$$

where P and I have their usual meanings and R includes all implications
that follow from transitivity. It is routine to see that the axioms are fulfilled,
that if $\varphi$ satisfies

$$\varphi(a) = \varphi(b) = 3\varphi(e), \qquad \varphi(c) = \varphi(d) = 2\varphi(e),$$

where $\varphi(e) > 0$, it is an extensive (additive over $\circ$) representation, and
that any extensive representation is of this form.
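
The "routine" verification for the five-element example can be mechanized. The sketch below is illustrative only: it encodes the example with a hypothetical `rank` dictionary standing in for R, checks Axioms 1 to 4 of Definition 2 by brute force (Axiom 5 holds trivially because the system is finite), and confirms that the assignment with $\varphi(e) = 1$ is order preserving and additive. All function names are introduced here and are not part of the original paper.

```python
from itertools import product

# Five-element system from the text; rank encodes R (higher rank dominates).
A = ["a", "b", "c", "d", "e"]
rank = {"a": 3, "b": 3, "c": 2, "d": 2, "e": 1}    # a I b P c I d P e
conc = {("e", "e"): "c", ("d", "e"): "b", ("e", "d"): "b",
        ("c", "e"): "a", ("e", "c"): "a"}           # the function o on B
B = set(conc)

def R(x, y):
    """x R y: x has at least as much of the attribute as y."""
    return rank[x] >= rank[y]

def I(x, y):
    """Indifference: x R y and y R x."""
    return R(x, y) and R(y, x)

def axiom1():
    # Weak order: connected and transitive.
    connected = all(R(x, y) or R(y, x) for x, y in product(A, A))
    transitive = all(R(x, z) for x, y, z in product(A, A, A)
                     if R(x, y) and R(y, z))
    return connected and transitive

def axiom2():
    # Restricted associativity: existence on the left implies the right.
    for (x, y), z in product(B, A):
        if (conc[(x, y)], z) in B:
            if (y, z) not in B or (x, conc[(y, z)]) not in B:
                return False
            if not R(conc[(conc[(x, y)], z)], conc[(x, conc[(y, z)])]):
                return False
    return True

def axiom3():
    # Monotonicity: (x,z) in B and x R y imply (z,y) in B and x o z R z o y.
    for (x, z), y in product(B, A):
        if R(x, y):
            if (z, y) not in B or not R(conc[(x, z)], conc[(z, y)]):
                return False
    return True

def axiom4():
    # If not x R y, some w (playing "y - x") has (x,w) in B and x o w I y.
    for x, y in product(A, A):
        if not R(x, y):
            if not any((x, w) in B and I(conc[(x, w)], y) for w in A):
                return False
    return True

def is_representation(phi):
    # Order preserving on all of A and additive on every pair in B.
    order_ok = all(R(x, y) == (phi[x] >= phi[y]) for x, y in product(A, A))
    additive = all(phi[conc[p]] == phi[p[0]] + phi[p[1]] for p in B)
    return order_ok and additive

phi = {"a": 3, "b": 3, "c": 2, "d": 2, "e": 1}      # phi(e) = 1 as the unit
print(axiom1(), axiom2(), axiom3(), axiom4(), is_representation(phi))
```

Multiplying every value of `phi` by the same positive constant leaves `is_representation` satisfied, which is the ratio-scale property in miniature; changing a single value, for example setting `phi["e"] = 2`, breaks it.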

3. A PROPERTY OF MAXIMAL ELEMENTS

Throughout this section the axioms of Definition 2 are assumed to hold,
and P and I are defined in terms of R in the usual way.
Lemma 1. If (x,y)eB, then (y,x)eB; and if in addition xRu and yRv,
then (u,v)eB and xoyRuov.
Proof. By Axiom 1, xRx and so by Axiom 3, (y,x)eB. Since xRu, Axiom
3 yields (y,u)eB and xoyRyou. Similarly, (u,v)eB and youRuov, and so
by Axiom 1, xoyRuov. QED
Lemma 2. If (x,y)eB, then xoyIyox.
Proof. By Lemma 1, (y,x)eB. Since, by Axiom 1, xRx, Axiom 3 yields
xoyRyox. Similarly, yRy yields yoxRxoy. QED
Lemma 3. If (x,y), (xoy,z)eB, then (xoy)ozIxo(yoz).
Proof. Immediate from Axiom 2 and Lemma 2. QED
Note that Lemmas 2 and 3 imply that the ordering and grouping of concatenations
is a mere matter of convenience, once we have assured their
existence.
Theorem 1. If x,yeA and (x,y)eB, then xoyRx.
Proof. Suppose that, contrary to the assertion, xPxoy. By Axiom 4 and
Lemma 3, there exists z = (x-xoy)eA such that xI(xoy)ozIxo(yoz). By
Lemma 1 and Axiom 3, xPxoyR(xoy)oyIxo2y, and so by a finite induction,
xPxony for all positive integers n. Putting these two observations together,
xo(yoz)IxPxony, and so by Axioms 1 and 3, yozRny. Axiom 5
implies, therefore, that yoz is maximal, and in particular yozRx. Thus, by
Lemma 1 and Axiom 3, xIxo(yoz)Rxox, and so by another finite induction
and Axiom 5 we conclude that x is also maximal. Thus, xRy. Suppose
that xPy, then since xR(x-y), Axioms 3 and 4 imply that yo(x-y)
IxPxoyR(x-y)oy, which is impossible by Lemma 2 and Axiom 1. So yIx,
and thus xPxox. Since x is maximal and xPxoy, Axioms 3 and 4 yield
xoxR(xoy)o(x-xoy)Ix, which contradicts xPxox. QED
Lemma 4. If a,beA are both maximal, then aIb.
Proof. Trivial. QED
Theorem 2. If a,xeA and (a,x)eB, then a is maximal relative to R and
o if and only if aIaox.
Proof. If a is maximal, then by Definition 1, aRaox. By Theorem 1, aoxRa;
hence, aIaox.
Conversely, suppose that aIaox and that a is not maximal. Since (a,x)eB,
Lemma 1 implies that (aox,x)eB and so by Axiom 3 and Lemma 3,
aIaoxI(aox)oxIao2x. By induction, for every positive integer n, nx exists
and aIaonx. Since, by assumption, a is not maximal, Axiom 5 implies
that, for some n, nxPa. So, by Axiom 3, aIaonxRaoa. Coupled with
Theorem 1, this means aIaoa. A finite induction then shows that for every
positive integer m, ma exists and aIma. As this is impossible by Axiom 5,
it follows that a must be maximal. QED
Corollary. In an extensive system, the assumption that there is no maximal
element relative to R and o is equivalent to the assertion that for all
x,yeA such that (x,y)eB, not xRxoy.
Proof. Theorems 1 and 2. QED

4. PRELIMINARY RESULTS WHEN NO MAXIMAL
ELEMENT EXISTS

Throughout this and the next section we assume the first four Axioms of
Definition 2 and the property (stated in the Corollary to Theorem 2) that
not xRxoy. Axiom 5 is not used again until Section 6, and then only once.
Lemma 5. If (x,u), (y,u)eB and xouRyou, then xRy.
Proof. Suppose, on the contrary, not xRy, then by Axiom 4, yIxo(y-x).
Thus, xouRyouIxo(y-x)ouI(xou)o(y-x), which contradicts the Corollary
to Theorem 2. QED
Lemma 6. If (x,y)eB, xPu, and yPv, then (x-u,y-v)eB and (x-u)o(y-v)
I(xoy-uov).
Proof. By the definitions of x-u and y-v and Theorem 1, xR(x-u) and
yR(y-v), and so by Lemma 1 (x-u,y-v)eB. By Lemmas 1, 2, and 3,

$$x \circ y \mathrel{I} u \circ (x-u) \circ y \mathrel{I} (x-u) \circ (u \circ y),$$
$$u \circ y \mathrel{I} u \circ v \circ (y-v) \mathrel{I} (y-v) \circ (u \circ v).$$

So, by Lemma 1,

$$x \circ y \mathrel{I} (x-u) \circ (y-v) \circ (u \circ v).$$

But, by definition,

$$x \circ y \mathrel{I} (x \circ y - u \circ v) \circ (u \circ v),$$

and the result follows by Axiom 1 and Lemma 5. QED
Corollary. If (x,y)eB and xPu, then (x-u)I(xoy-uoy).
Proof. From the proof of Lemma 6,

$$(x-u) \circ (u \circ y) \mathrel{I} x \circ y \mathrel{I} (x \circ y - u \circ y) \circ (u \circ y),$$

and the result follows by Lemma 5. QED


Lemma 7. If (mx,nx)eB, then (m + n)x is defined and (m + n)xImxonx.
Proof. By induction and using Lemmas 2 and 3,

$$mx \circ nx \mathrel{I} mx \circ [(n-1)x \circ x] \mathrel{I} (m+1)x \circ (n-1)x \mathrel{I} \cdots \mathrel{I} (m+n)x.$$
QED

Lemma 8. There exists eeA such that (e,e)eB.
Proof. Since B is non-empty, there exists (x,y)eB. Either xRy or yRx.
If the former, then it with yRy implies, by Lemma 1, that (y,y)eB. If the
latter, (x,x)eB. QED
Corollary. If eRx and eRy, then (x,y)eB.
Lemma 9. If eRx, eRy, eRz, xoyPe, and yozPe, then xo(yoz-e)I(xoy-e)oz.
Proof. Since eoeRxoyI(xoy-e)oe, Lemma 5 yields eR(xoy-e). Similarly,
eR(yoz-e). Hence by the Corollary to Lemma 8, the asserted concatenations
exist. Let u=e-y. If uRz, then by Lemma 1 we obtain eIuoyRzoy, contrary
to assumption; so zPu. By the definition of z-u, xozIxo(z-u)ou,
and so by the definition of xoz-u, (xoz-u)Ixo(z-u). From the Corollary
to Lemma 6, (z-u)I(yoz-uoy)I(yoz-e); hence by Lemma 1, (xoz-u)
Ixo(yoz-e). In a similar manner, (xoz-u)I(xoy-e)oz, and the result
follows by Axiom 1. QED
Lemma 10. If eRx, eRy, eRz, xoyPe, and eRyoz, then (x,yoz)eB,
xoyozPe, and (xoy-e)ozI(xoyoz-e).
Proof. From eRyoz and eRx, the Corollary to Lemma 8 shows that
(x,yoz)eB. Moreover, if eRxoyoz, then Theorem 1 yields eRxoy, contrary
to assumption. Finally,

$$(x \circ y - e) \circ z \circ e \mathrel{I} (x \circ y - e) \circ e \circ z \mathrel{I} x \circ y \circ z \mathrel{I} (x \circ y \circ z - e) \circ e,$$

and the result follows by Lemma 5. QED

5. CONSTRUCTION OF AN ORDINARY EXTENSIVE SYSTEM


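Definition 3. Suppose that $\mathfrak{a} = \langle A,B,R,\circ \rangle$ is an extensive system and
let $e \in A$ be a fixed element for which $(e,e) \in B$ (see Lemma 8). The system
$\mathfrak{a}_e = \langle A_e, R_e, * \rangle$ is defined by:

$$A_e = \{(m,x) \mid m \text{ is a non-negative integer},\ x \in A, \text{ and } eRx\},$$

$R_e$: $(m,x) \mathrel{R_e} (n,y)$ if either $m > n$, or $m = n$ and $xRy$,

$$*:\quad (m,x)*(n,y) = \begin{cases} (m+n,\ x \circ y) & \text{if } e \mathrel{R} x \circ y, \\ (m+n+1,\ x \circ y - e) & \text{if } x \circ y \mathrel{P} e. \end{cases}$$

Note that * is well defined since, by the Corollary to Lemma 8, eRx and
eRy imply (x,y)eB; and if xoyPe then eR(xoy-e) since the converse,
(xoy-e)Pe, leads to the contradiction

$$x \circ y \mathrel{I} (x \circ y - e) \circ e \mathrel{P} e \circ e \mathrel{R} x \circ y.$$

It is clear that * is commutative.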
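
The operation * behaves like addition with a carry: whenever the concatenated remainders exceed e, one copy of e is moved into the integer counter. The following sketch illustrates this in a purely numerical model that is an assumption of this example, not part of the paper: elements are positive rationals, R is the usual ordering, concatenation is addition, and e is taken as the unit. The names `star` and `value` are hypothetical.

```python
from fractions import Fraction
from itertools import product

E = Fraction(1)  # the fixed element e, used as the unit (assumption: phi(e) = 1)

def star(p, q):
    """(m,x)*(n,y): add the parts, carrying one copy of e when x o y P e."""
    (m, x), (n, y) = p, q
    s = x + y                      # stands in for the concatenation x o y
    if s <= E:                     # case e R (x o y)
        return (m + n, s)
    return (m + n + 1, s - E)      # case (x o y) P e; s - E plays x o y - e

def value(p):
    """Numerical value m*phi(e) + phi(x) attached to the pair (m, x)."""
    m, x = p
    return m * E + x

# All pairs (m, x) with m in 0..2 and x in {1/4, 2/4, 3/4, 1}, so that e R x.
pairs = [(m, Fraction(k, 4)) for m in range(3) for k in range(1, 5)]

additive = all(value(star(p, q)) == value(p) + value(q)
               for p, q in product(pairs, pairs))
commutative = all(star(p, q) == star(q, p) for p, q in product(pairs, pairs))
associative = all(star(star(p, q), r) == star(p, star(q, r))
                  for p, q, r in product(pairs, pairs, pairs))
print(additive, commutative, associative)
```

In this model the second component always stays in the interval (0, e], so * is defined for every pair of elements even though the underlying concatenation may be restricted; that closure is what the construction is designed to recover.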


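Theorem 3. If Axioms 1-4 hold and if there is no element that is maximal
relative to R and o, then the system $\mathfrak{a}_e$ satisfies Suppes’ axioms for an
extensive system.
Proof. 1. $R_e$ is obviously a weak order.
2. * is obviously a function from $A_e \times A_e$ into $A_e$.
3. To show the associativity of *, observe that by definition,

$$[(m,x)*(n,y)]*(p,z) = \begin{cases}
(m+n+p,\ x \circ y \circ z) & \text{if } e \mathrel{R} x \circ y \text{ and } e \mathrel{R} x \circ y \circ z, \\
(m+n+p+1,\ (x \circ y - e) \circ z) & \text{if } x \circ y \mathrel{P} e \text{ and } e \mathrel{R} (x \circ y - e) \circ z, \\
(m+n+p+1,\ x \circ y \circ z - e) & \text{if } e \mathrel{R} x \circ y \text{ and } x \circ y \circ z \mathrel{P} e, \\
(m+n+p+2,\ (x \circ y - e) \circ z - e) & \text{if } x \circ y \mathrel{P} e \text{ and } (x \circ y - e) \circ z \mathrel{P} e,
\end{cases}$$

and

$$(m,x)*[(n,y)*(p,z)] = \begin{cases}
(m+n+p,\ x \circ y \circ z) & \text{if } e \mathrel{R} y \circ z \text{ and } e \mathrel{R} x \circ y \circ z, \\
(m+n+p+1,\ x \circ (y \circ z - e)) & \text{if } y \circ z \mathrel{P} e \text{ and } e \mathrel{R} x \circ (y \circ z - e), \\
(m+n+p+1,\ x \circ y \circ z - e) & \text{if } e \mathrel{R} y \circ z \text{ and } x \circ y \circ z \mathrel{P} e, \\
(m+n+p+2,\ x \circ (y \circ z - e) - e) & \text{if } y \circ z \mathrel{P} e \text{ and } x \circ (y \circ z - e) \mathrel{P} e.
\end{cases}$$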

There are four cases:
(i) If eRxoy and eRxoyoz, then, by the Corollary to Theorem 2, eRyoz,
and so the first line holds in each case and they are identical.
(ii) If xoyPe and eR(xoy-e)oz, then either yozPe or eRyoz. If the
former, eR(xoy-e)oz together with Lemma 9 yields eRxo(yoz-e), and so
the second line holds in each case and they are equivalent. If the latter,
Lemma 10 yields xoyozPe. So the second line of the first expression and
the third line of the second expression hold and, using Lemma 10, these
expressions are equivalent.
(iii) If eRxoy and xoyozPe, then either eRyoz or yozPe. If the former,
the third line holds in each case and they are identical. If the latter, we
show eRxo(yoz-e) in which case the second line of the second expression
holds and, using Lemma 10, it is equivalent to the third line of the first
expression. Suppose that xo(yoz-e)Pe, then xoyozIxo(yoz-e)oePeoeR
(xoy)oz, a contradiction.
(iv) If xoyPe and (xoy-e)ozPe, then yozPe. For suppose, on the contrary,
that eRyoz, then with eRx we obtain the contradiction eoeRxoyozI
(xoy-e)oeozPeoe. Lemma 9 then yields xo(yoz-e)I(xoy-e)ozPe, and so
the fourth line of the second expression also holds and, using Lemma 9,
these expressions are equivalent.
4. Suppose that (m,x)Re(n,y), then we wish to show that (m,x)
*(p,z)Re(n,y)*(p,z). If m > n, then there are four cases:
(i) If eRxoz and eRyoz, then (m,x)*(p,z) = (m+p,xoz)Re(n+1+p,
xoz)Re(n+p,yoz) = (n,y)*(p,z).
(ii) If eRxoz and yozPe, then xozR(yoz-e) since otherwise we obtain
yozI(yoz-e)oePxozoe, whence, by Lemma 5, yPxoePe, contrary
to choice of y. Thus, (m,x)*(p,z) = (m+p,xoz)Re(n+1+p,
xoz)Re(n+1+p,yoz-e) = (n,y)*(p,z).
(iii) If xozPe and eRyoz, then
(m,x)*(p,z) = (m+p+1,xoz-e)Re(n+p+2,xoz-e)Re(n+p,
yoz) = (n,y)*(p,z).
(iv) If xozPe and yozPe, then
(m,x)*(p,z) = (m+p+1,xoz-e)Re(n+p+2,xoz-e)Re(n+p+
1,yoz-e) = (n,y)*(p,z).
Alternatively, m = n and xRy, in which case there are again four cases:
(i) If eRxoz and eRyoz, then the result is immediate by Axiom 3.
(ii) The case where eRxoz and yozPe is impossible since Axiom 3 and
xRy imply the contradiction eRxozRyozPe.
(iii) If xozPe and eRyoz, then
(m,x)*(p,z) = (m+p+1,xoz-e)Ie(n+p+1,xoz-e)Re(n+p,
yoz) = (n,y)*(p,z).
(iv) If xozPe and yozPe, then
(m,x)*(p,z) = (m+p+1,xoz-e)Ie(n+p+1,xoz-e)Re(n+p+
1,yoz-e) = (n,y)*(p,z) because the supposition (yoz-e)P
(xoz-e) yields

$$y \circ z \mathrel{I} (y \circ z - e) \circ e \mathrel{P} (x \circ z - e) \circ e \mathrel{I} x \circ z,$$

which, by Lemma 5, implies yPx, contrary to assumption.


5. Suppose that not (m,x)Re(n,y), i.e., (n,y)Pe(m,x), then we show
that some (p,z) exists for which (n,y)Ie(m,x)*(p,z). If n > m, then
either yPx, in which case (m,x)*(n-m,y-x)Ie(n,y); or xIy, in which case
(m,x)*(n-m-1,e)Ie(n,xoe-e)Ie(n,y); or xPy, in which case
(m,x)*(n-m-1,eoy-x)Ie(n,eoy-e)Ie(n,y). If n = m and yPx, then
(m,x)*(0,y-x)Ie(n,y).
6. Next we show that not (m,x)Re(m,x)*(n,y).
By definition,

$$(m,x)*(n,y) = \begin{cases} (m+n,\ x \circ y) & \text{if } e \mathrel{R} x \circ y, \\ (m+n+1,\ x \circ y - e) & \text{if } x \circ y \mathrel{P} e. \end{cases}$$

If xoyPe, then m + n + 1 > m and we are done. If eRxoy, the same argument
holds if n > 0. If n = 0, then (m,x)*(0,y)Pe(m,x) because xoyPx
by the Corollary to Theorem 2.
7. Suppose that (m,x)Re(n,y). Choose any integer k such that kn > m,
then

(k+1)(n,y)Re(kn,y)Re(m,x). QED

6. IMBEDDING OF $\mathfrak{a}/I$ IN $A_e/I_e$

Throughout this section we deal only with the systems that result by
treating as elements the equivalence classes under, respectively, the equivalence
relations $I$ and $I_e$. Letters from the beginning of the alphabet will be
used to denote these classes and = replaces $I$ and $I_e$; e denotes both the
element of Definition 3 and its equivalence class. We now invoke all five
axioms of Definition 2 plus the assumption of no maximal elements.
Definition 4. The subsystem $\mathfrak{a}'_e = \langle A'_e, B'_e, R'_e, *' \rangle$ of $\mathfrak{a}_e$ is defined by:

$$A'_e = \{(N_a, a_e) \mid a \in A/I,\ N_a = \max\{n \mid a \mathrel{P} ne\},\ a_e = a - N_a e\},$$
$$B'_e = \{((N_a, a_e), (N_b, b_e)) \mid (a,b) \in B/I\},$$

$R'_e$ is the restriction of $R_e$ to $A'_e$,
$*'$ is the restriction of $*$ to $B'_e$.

Note that by Axiom 5, $N_a$ exists.
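
The decomposition of an element a into a count $N_a$ and a remainder $a_e$ can be illustrated numerically. The sketch below again assumes a model that is not in the paper: elements are positive rationals, P is ">", and ne is ordinary multiplication. It also adopts the convention, assumed here, that n = 0 is allowed, so that an element with eRa receives $N_a = 0$. The helper names `decompose` and `star` are hypothetical.

```python
from fractions import Fraction

def decompose(a, e):
    """Return (N_a, a_e) with N_a = max{n >= 0 : a > n*e} and a_e = a - N_a*e.

    a and e are positive rationals; the loop stops at the largest n for
    which a still strictly exceeds n*e, so 0 < a_e <= e (cf. Lemma 11).
    """
    n = 0
    while a > (n + 1) * e:
        n += 1
    return n, a - n * e

def star(p, q, e):
    """The operation * of Definition 3 in the same numerical model."""
    (m, x), (n, y) = p, q
    s = x + y
    return (m + n, s) if s <= e else (m + n + 1, s - e)

e = Fraction(1, 3)
a, b = Fraction(5, 6), Fraction(2, 3)

pa, pb, pc = decompose(a, e), decompose(b, e), decompose(a + b, e)
print(pa, pb, pc)
# The image of a o b equals the *-product of the images of a and b.
print(star(pa, pb, e) == pc)
```

Note that an exact multiple of e, such as b = 2e above, decomposes as (1, e) rather than (2, 0): the remainder is never empty, which matches the requirement that $a_e$ be an element of the system.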


Lemma 11. $e \mathrel{R} a_e$.
Proof. Suppose that $a_e \mathrel{P} e$. Since $(a_e, N_a e) \in B/I$, it follows from Lemma 1
that $(e, N_a e) \in B/I$, and so

$$a = N_a e \circ a_e \mathrel{P} N_a e \circ e = (N_a + 1)e,$$

contrary to the choice of $N_a$. QED


Theorem 4. If $\mathfrak{a}$ is an extensive system with no element that is maximal
relative to $R$ and $\circ$, then the subsystem $\mathfrak{a}'_e$ is isomorphic to $\mathfrak{a}/I$.
Proof. The mapping $a \leftrightarrow (N_a, a_e)$ is 1:1. For suppose $N_a = N_b = N$ and
$a_e = b_e$, then $a - Ne = b - Ne$ and so $a = (a - Ne) \circ Ne = (b - Ne) \circ Ne = b$. The
converse is obvious.
The mapping is order preserving. If $aRb$, then clearly $N_a \geq N_b$. If
$N_a > N_b$, then $(N_a, a_e) \mathrel{R'_e} (N_b, b_e)$. If $N_a = N_b = N$, then $a_e = (a - Ne) \mathrel{R}
(b - Ne) = b_e$, and again the result follows. The converse is similar.
The mapping preserves $\circ$. Suppose that $(a,b) \in B/I$ and $c = a \circ b$. Since
$a \mathrel{R} N_a e$ and $b \mathrel{R} N_b e$, then by Lemma 7, $a \circ b \mathrel{R} N_a e \circ N_b e = (N_a + N_b)e$. Thus,
$N_c \geq N_a + N_b$. Let $N_c - (N_a + N_b) = k$. By Lemma 6,

$$a_e \circ b_e = [a \circ b - (N_a + N_b)e] \mathrel{P} [N_c e - (N_a + N_b)e] = ke.$$

So, if $e \mathrel{R} a_e \circ b_e$, $k = 0$, i.e., $N_c = N_a + N_b$ and $a_e \circ b_e = c_e$. Otherwise,
$a_e \circ b_e \mathrel{P} e$. Since $e \circ e \mathrel{R} a_e \circ b_e$, it follows that $k \leq 1$. But since
$[a \circ b - (N_a + N_b)e] \mathrel{P} e$, $a \circ b \mathrel{P} (N_a + N_b + 1)e$, and so $k = 1$, from which the result
follows.
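Conversely, suppose that

$$(N_c, c_e) = (N_a, a_e)*(N_b, b_e) = \begin{cases} (N_a + N_b,\ a_e \circ b_e) & \text{if } e \mathrel{R} a_e \circ b_e, \\ (N_a + N_b + 1,\ a_e \circ b_e - e) & \text{if } a_e \circ b_e \mathrel{P} e. \end{cases}$$

In the first case,

$$c = c_e \circ N_c e = a_e \circ b_e \circ (N_a + N_b)e,$$

and in the second

$$c = c_e \circ N_c e = (a_e \circ b_e - e) \circ (N_a + N_b + 1)e = (a_e \circ b_e) \circ (N_a + N_b)e.$$

But, by Lemma 6,

$$(a_e \circ b_e) \circ (N_a + N_b)e = (a - N_a e) \circ (b - N_b e) \circ (N_a + N_b)e = [a \circ b - (N_a + N_b)e] \circ (N_a + N_b)e = a \circ b.$$
QED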

7. REPRESENTATION AND UNIQUENESS THEOREMS

Theorem 5. If $\mathfrak{a} = \langle A,B,R,\circ \rangle$ is an extensive system with no element
that is maximal relative to $R$ and $\circ$, then there exists a positive real-valued
function $\varphi$ on $A$ such that

(i) $xRy$ if and only if $\varphi(x) \geq \varphi(y)$;
(ii) if $(x,y) \in B$, $\varphi(x \circ y) = \varphi(x) + \varphi(y)$.

Proof. Suppes (1951) proved the existence of such a function for any
system fulfilling his axioms, in particular for $\mathfrak{a}_e$. The restriction of it to $\mathfrak{a}'_e$
transformed isomorphically by Theorem 4 to $\mathfrak{a}$ completes the proof. QED
Theorem 6. If $\varphi$ and $\psi$ are two functions fulfilling Theorem 5, then there
exists a constant $\alpha > 0$ such that $\varphi = \alpha\psi$.
Proof. With no loss of generality, we may suppose that $\varphi(e) = \psi(e) = 1$ and
then show $\varphi = \psi$. Suppose that $\varphi(x) \neq \psi(x)$, then since $x_e \circ N_x e \mathrel{I} x$, we see
immediately that $\varphi(x_e) \neq \psi(x_e)$. But since in $\mathfrak{a}_e$, $\varphi(n,x) = n + \varphi(x)$, this
implies non-unique additive scales on $\mathfrak{a}_e$, contrary to Suppes’ uniqueness
theorem. QED
Theorem 7. If $\langle A,B,R,\circ \rangle$ is an extensive system with no element that
is maximal relative to $R$ and $\circ$, and if $B$ is finite, then one, and so an infinity,
of the representations of Theorem 5 is into the positive integers.
Proof. Since $B$ is finite, so is the set

$$C = \{x \mid x \in A \text{ and there exists } y \in A \text{ such that } (x,y) \in B\}.$$

Choose e to be the least element of $C$ under the ordering $R$. By Lemma 1,
$(e,e) \in B$; moreover, if $a \in A$, then $aRe$ since if $ePa$ then, by Lemma 1, $(a,a) \in B$,
which contradicts the choice of e. Thus, by Lemma 11 the elements of $A'_e$
of Definition 4 must each be of the form $(N_x, e)$, where $x \in A/I$ and $N_x$ is a
nonnegative integer. Hence, by Theorems 4 and 5, any representation $\varphi$ has
the property that

$$\varphi(x) = \varphi(N_x, e) = (N_x + 1)\varphi(e).$$

Choosing $\varphi$ so that $\varphi(e)$ is a positive integer establishes the result. QED


Note that, in practice, representations into the rationals, rather than into
the integers, are usually used because it is rarely convenient to take as the
unit the smallest element of C.
Theorem 8. Suppose that $\langle A,B,R,\circ \rangle$ is an extensive system with at least
one element a that is maximal relative to $R$ and $\circ$. Let $A'$ be $A$ with all
maximal elements deleted, $R'$ be the restriction of $R$ to $A'$, $B' = \{(x,y) \mid
x,y \in A',\ (x,y) \in B,\ x \circ y \in A'\}$, and $\circ'$ be the restriction of $\circ$ to $B'$. Then if $A'$ and
$B'$ are both non-empty, $\langle A',B',R',\circ' \rangle$ is an extensive system with no
maximal element.
Proof. We need only check the axioms of Definition 2, and it is clear that they
are all satisfied except, possibly, 4. It could fail if, in the original system,
$a \mathrel{I} (y-x)$. In that case, however, $y \mathrel{I} x \circ (y-x) \mathrel{I} x \circ a \mathrel{I} a$, which contradicts the
choice of $y \in A'$. QED
Theorem 9. Suppose that $\langle A,B,R,\circ \rangle$ is an extensive system with at
least one element a that is maximal relative to $R$ and $\circ$. Let $\langle A',B',R',\circ' \rangle$
be defined as in Theorem 8, suppose that $A'$ and $B'$ are non-empty, and let
$C$ be any positive real number.
1. If there exist $u,v \in A'$ such that $(u,v) \in B$ and $u \circ v \mathrel{R} a$, then there exists a
positive real-valued function $\varphi$ on $A$ such that for all $x,y \in A$,

i) $xRy$ if and only if $\varphi(x) \geq \varphi(y)$,
ii) $\varphi(x) = C$ if $xIa$,
iii) if $(x,y) \in B'$, then $\varphi(x \circ y) = \varphi(x) + \varphi(y)$.

2. If for all $x,y \in A'$, $(x,y) \in B$ implies not $x \circ y \mathrel{R} a$, then there exists a positive
real-valued function $\Phi$ on $A$ such that for all $x,y \in A$,

i) $xRy$ if and only if $\Phi(x) \geq \Phi(y)$,
ii) $\Phi(x) = C$ if $xIa$,
iii) if $(x,y) \in B$,

$$\Phi(x \circ y) = \frac{\Phi(x) + \Phi(y)}{1 + \Phi(x)\Phi(y)/C^{2}}.$$

Proof. By Theorems 5 and 8, there exists an additive function $\varphi$ over $A'$.
1. Choose the unit of $\varphi$ so that $\varphi(u) + \varphi(v) < C$ and assign $\varphi(x) = C$
if $xIa$. Parts ii and iii are clearly met. To show i it is sufficient to show that if
$x \in A'$, then $\varphi(x) \leq \varphi(u) + \varphi(v)$. This is obvious if either $uRx$ or $vRx$, so
we assume that $xPu$ and $xPv$. By Axiom 4, there exists $x - u \in A'$ such that
$(u, x-u) \in B'$ and $x \mathrel{I} u \circ (x-u)$. Thus,

$$u \circ v \mathrel{I} a \mathrel{P} x \mathrel{I} u \circ (x-u),$$

from which it follows that $v \mathrel{R} (x-u)$ since the contrary leads to a contradiction
by Axiom 3. So, by properties of $\varphi$,

$$\varphi(x) = \varphi(u) + \varphi(x-u) \leq \varphi(u) + \varphi(v).$$

2. Define

$$\Phi(x) = \begin{cases} C \tanh \varphi(x) & \text{if } aPx, \\ C & \text{if } aIx. \end{cases}$$

Since tanh is strictly monotonic increasing and < 1, it is clear that parts
i and ii hold. Using elementary properties of tanh, part iii follows for those
$x,y \in A'$ for which $(x,y) \in B$ since, by assumption, $(x,y) \in B'$. If $(x,y) \in B$ and
either x or y is indifferent to a, then by Theorem 2, $x \circ y \mathrel{I} a$, and by substituting $\Phi(a) = C$ in
both sides of part iii we see that it still holds. QED
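
The "elementary properties of tanh" invoked for part iii amount to the addition formula for the hyperbolic tangent. As a sketch of that step, write $s = \varphi(x)$ and $t = \varphi(y)$ (shorthand introduced here, not in the original) and use the additivity of $\varphi$ on $B'$:

```latex
\begin{align*}
\Phi(x \circ y) &= C \tanh\bigl(\varphi(x \circ y)\bigr) = C \tanh(s + t) \\
 &= C\,\frac{\tanh s + \tanh t}{1 + \tanh s\,\tanh t}
  = \frac{C\tanh s + C\tanh t}{1 + (C\tanh s)(C\tanh t)/C^{2}}
  = \frac{\Phi(x) + \Phi(y)}{1 + \Phi(x)\Phi(y)/C^{2}}.
\end{align*}
```

For the boundary case, substituting $\Phi(x) = C$ on the right gives $(C + \Phi(y))/(1 + \Phi(y)/C) = C$, in agreement with $x \circ y \mathrel{I} a$.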
Property 2, iii of Theorem 9 is the well-known formula for the relativistic
“addition” of velocities. It is obvious from the proof of part 1 that an
additive representation also exists if one is willing to assign $\infty$ to a maximal
element (in the case of velocity, that of light) and extend + in the usual
way. Such a representation of velocity fails, however, to have the property
that velocity equals the distance traversed divided by the time it takes. Evidently,
physicists have preferred to retain the latter derived property and
to sacrifice the simple additive representation of concatenation. Perhaps
non-additive representations should be kept in mind in other sciences, even
when additive ones exist.
In the same vein, it is important to recognize that the representation given
in part 2 of Theorem 9 is by no means the only one possible—it is simply
the one that has arisen in the theory of relativity. Specifically, let f be any
monotonic increasing function that maps the positive reals onto the open
interval (0,C) with the property that there exists a function F of two
variables such that f(x + y) = F[f(x),f(y)]. Then

$$\Phi(x) = \begin{cases} f[\varphi(x)] & \text{if } aPx, \\ C & \text{if } aIx \end{cases}$$

has properties 2,i and 2,ii of Theorem 9 and property 2,iii is replaced by:

$$\Phi(x \circ y) = F[\Phi(x), \Phi(y)].$$

Many results on functional equations of the type f(x + y) = F[f(x),f(y)]
are given in Section 2.2 of Aczel (1966).
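
To make the freedom in choosing f concrete, the sketch below compares the tanh representation with one hypothetical alternative, $f(t) = C(1 - e^{-t})$, whose induced composition rule is $F(u,v) = u + v - uv/C$. The alternative is chosen here purely for illustration and does not appear in the paper; the check is numerical, over a handful of sample points.

```python
import math

C = 1.0  # value assigned to the maximal element (any positive constant works)

def f_tanh(t):
    return C * math.tanh(t)

def F_tanh(u, v):
    # Relativistic composition, property 2,iii of Theorem 9.
    return (u + v) / (1.0 + u * v / C**2)

def f_exp(t):
    # Hypothetical alternative: also maps (0, inf) monotonically onto (0, C).
    return C * (1.0 - math.exp(-t))

def F_exp(u, v):
    # Composition induced by f_exp: f(s + t) = f(s) + f(t) - f(s) * f(t) / C.
    return u + v - u * v / C

def satisfies_equation(f, F, samples):
    """Check f(s + t) == F(f(s), f(t)) on all sample pairs, up to rounding."""
    return all(math.isclose(f(s + t), F(f(s), f(t)), rel_tol=1e-9)
               for s in samples for t in samples)

samples = [0.1, 0.5, 1.0, 2.0, 3.5]
print(satisfies_equation(f_tanh, F_tanh, samples))   # matching pair
print(satisfies_equation(f_exp, F_exp, samples))     # matching pair
print(satisfies_equation(f_exp, F_tanh, samples))    # mismatched pair
```

The first two checks succeed and the mismatched pair fails, which illustrates that each admissible f carries its own composition rule F; the relativistic formula is singled out by physics, not by the axioms.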

8. FUNDAMENTAL MEASUREMENT OF TIME DURATION

Although the theory of extensive measurement is widely accepted as a
suitable mathematical framework for the fundamental measurement of
some basic physical quantities, in particular mass and length, its role in
justifying other measures, such as velocity and time, has seemed somewhat
less secure to some authors. Of the two, the former has been the less vexing
since if both length and time can be measured fundamentally, then velocity
can be handled as a derived measure. In fact, as we have seen in Theorem 9,
it can also be treated as a fundamental quantity provided that a non-additive
representation is accepted.
Time is rarely discussed in detail in connection with presentations of
fundamental measurement schemes, and some authors seem to believe that
it cannot be handled by extensive measurement methods. In fact, however,
Campbell (pp. 550-553 of the 1957 edition) outlined a suitable interpretation
for measuring time durations, but this seems to have been overlooked,
perhaps because his exposition is a bit opaque. It may, therefore, be worth
restating it here. The entities of A are (the periods of) a family of pendulums
and their concatenations as defined below. The ordering R is determined
as follows: if pendulums x and y are started at exactly the same
time (say by placing them side by side and releasing them together by
dropping a supporting rod) and if x fails to complete its first period before
y completes its first period, then xRy. The concatenation xoy denotes any
pendulum with the following property: if xoy and x are released at the
same time and if y is released exactly when x completes its first period (doing
this would be non-trivial in practice), then xoy and y will complete their
first periods at exactly the same time.
With these interpretations, it is no more difficult to be convinced that the
axioms of extensive measurement (with no maximal element) are fulfilled
than it is with the usual interpretations for mass and length. As there, some
care is needed in the choice of pendulums to be sure that Axiom 4 is satisfied.
Moreover, our modification of the axioms to permit restrictions on the
freedom to concatenate possesses practical advantages—temporal rather
than spatial—similar to those for mass and length. The construction of a
standard series based upon n concatenations of a duration with itself is
especially simple: one merely counts off n periods of the pendulum (of
course, one must verify that any two periods of an uninterrupted sequence
are equivalent in the sense that each matches the first period of some other
pendulum).
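The interpretation just described can be mirrored in a small computational sketch. The Python fragment below is offered only as an illustration under idealized assumptions: each pendulum is represented solely by the exact duration of its period, and the names Pendulum, R, equivalent, concat, and standard_series, as well as the use of rational numbers, are ours and do not appear in Campbell's account. It checks, for a few sample pendulums, properties one would expect of an extensive structure, and it is not a statement of the axioms themselves.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Pendulum:
    """Idealized pendulum, identified only by the duration of one period."""
    period: Fraction  # arbitrary time units; exact arithmetic avoids rounding


def R(x: Pendulum, y: Pendulum) -> bool:
    """x R y: released together, x does not finish its first period before y."""
    return x.period >= y.period


def equivalent(x: Pendulum, y: Pendulum) -> bool:
    """x and y finish their first periods together (x R y and y R x)."""
    return R(x, y) and R(y, x)


def concat(x: Pendulum, y: Pendulum) -> Pendulum:
    """x o y: a pendulum that, released with x, finishes its first period
    exactly when y (released as x finishes) finishes its own first period."""
    return Pendulum(x.period + y.period)


def standard_series(x: Pendulum, n: int) -> Pendulum:
    """n-fold concatenation of x with itself: count off n periods of x."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = x
    for _ in range(n - 1):
        result = concat(result, x)
    return result


if __name__ == "__main__":
    a, b, c = (Pendulum(Fraction(p)) for p in ("1/2", "3/4", "2"))
    assert R(a, b) or R(b, a)                      # any two are comparable
    assert equivalent(concat(a, b), concat(b, a))  # order of concatenation is immaterial
    assert equivalent(concat(concat(a, b), c), concat(a, concat(b, c)))
    assert R(concat(a, b), a) and not R(a, concat(a, b))  # x o y outlasts x
    print(standard_series(a, 4).period)            # duration of four periods of a
```

Representing a pendulum by a single number presupposes the very measure that the axioms are meant to justify; the sketch is therefore a consistency check on the interpretation, not a derivation of the representation.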

NOTES
1. We are indebted to Patrick Suppes for his critical comments on earlier drafts
and, in particular, for his suggestion that Theorem 7 be proved.
2. The first author worked on the problem both at the University of Pennsylvania
and at the Center for Advanced Study in the Behavioral Sciences. During the earlier
phase he received partial support from National Science Foundation Grant GB-1462
to the University of Pennsylvania, and during the later phase he was a National
Science Foundation Senior Postdoctoral Fellow.
3. The second author was a Fellow of the Miller Institute for Basic Research in
Science at the University of California, Berkeley, during the period of this work.

REFERENCES
Aczel, J. Lectures on functional equations and their applications. New
York: Academic Press, 1966.
Behrend, F. A. “A contribution to the theory of magnitudes and the foundation
of analysis,” Math. Zeit., 1956, 63, pp. 345-362.
Campbell, N. R. An account of the principles of measurement and calculation.
London: Longmans, Green, 1928.
Campbell, N. R. Physics: the elements. Cambridge: Cambridge University
Press, 1920. Reprinted as Foundations of science: the philosophy of
theory and experiment. New York: Dover, 1957.
Luce, R. D., & Tukey, J. “Simultaneous conjoint measurement: a new type
of fundamental measurement,” J. math. Psychol., 1964, 1, pp. 1-27.
Luce, R. D. “Two extensions of conjoint measurement,” J. math. Psychol.,
1966, 3, pp. 348-370.
Nagel, E. “Measurement,” Erkenntnis, 1932, 2, pp. 313-332. Reprinted in
A. Danto and S. Morgenbesser (Eds.) Philosophy of science. New York:
Meridian Books, 1960, pp. 121-140.
Suppes, P. “A set of independent axioms for extensive quantities,”
Portugaliae Mathematica, 1951, 10, pp. 163-172.
Suppes, P., & Zinnes, J. L. “Basic measurement theory.” In R. D. Luce,
R. R. Bush, & E. Galanter (Eds.) Handbook of mathematical psychology,
Vol. 1. New York: Wiley, 1965, pp. 1-76.

CAUSATION AND ACTION
Morton White

The question whether human actions are causally connected with choices
and motives is at least as old as the debate over free will and determinism.
Some philosophers have analyzed a voluntary action as one caused by a
choice; others have analyzed it as one which would not have been performed
if the agent had chosen not to perform it; and still other philosophers, with
no explicit metaphysical axe to grind, have asserted that an action may be
causally explained by citing a choice or a character trait. In all of these
cases it has been argued, and I think correctly, that a causal connection is
asserted, either by means of a “because” statement like “He did that because
he chose to do it,” or by means of a contrary-to-fact conditional like “He
would have done that if he had chosen to do it.” In its logical features the
first sort of statement is said to resemble “This match lit because it was
struck,” and the second to resemble “That match would have lit if it had
been struck.” It has also been correctly argued that we make causal statements
when we cite a state or condition rather than an event in the antecedent
of a “because” statement or in the antecedent of a contrary-to-fact
conditional. We may say “He did that because he was a generous man” or
“He would have done that if he had been a generous man,” just as we may
say “That match lit because it was dry.” Finally, all of these causal statements
have been analyzed, also correctly in my opinion, so as to reveal their
logical connections with laws of nature. Those who accept the view that I
have been outlining may hold different opinions about the detailed features
of those logical connections, but such philosophers are nevertheless banded
together in a Humean community which seems to me to be on the right track.
Arrayed against that community there has always been a vocal group of
distinguished philosophers who implicitly or explicitly reject the view that
choices and actions are causally connected, as well as the view that actions
may be causally explained in a manner that requires appeal to laws. But in
recent times they have come to sing their song in a particularly alluring and
even sirenic manner by using certain techniques of linguistic philosophy. In
this paper, therefore, I wish to address myself to certain widely accepted
contemporary criticisms which are calculated to undermine the view outlined
in the first paragraph above.
Those criticisms I shall distinguish as follows: (I) a criticism which
relies on what I think are the irrelevant features of certain first-person
sentences; (II) one which is based on an excessively narrow view of causal
explanation and of its relationship with laws or regularities; (III) one that
may be called cryptocausal or cryptodeterministic because it seems to rely
on the very notion of causal connection it purports to dispense with; and
(IV) a pair of criticisms that appeal unsuccessfully to a distinction between
analytic and synthetic statements or to an allied distinction.

I

The first criticism to which I address myself appears in an examination
made by J. L. Austin of certain things said by G. E. Moore. Moore maintained
in his Ethics that a voluntary past action is one that the agent was
not bound to have done, that is to say, one that the agent could have not
done, and (for reasons which we need not consider), Moore defines a
voluntary past action as one of which it may be said that if the agent had
chosen not to do it, he would not have done it. On the basis of other things
that Moore says, it is pretty clear that he thinks of statements of the form
“If the agent had chosen to do X, he would have done X” as causal in just
the sense in which “If that match had been struck, it would have lit” is
causal. And it is for holding such a view that Austin criticizes him.
Before examining Austin’s criticism, I want to distinguish certain issues.
One is whether the concept of voluntary action is correctly analyzed by
Moore in his Ethics. A second issue is whether Moore successfully argued
for the analysis given in his Ethics. A third is whether the contrary-to-fact
conditional which appears in the analysans of Moore’s analysis is causal in
character. Although Austin is concerned with all of these issues in his
examination of Moore, it is the third issue that I am primarily concerned
with in this paper. Even if Moore be wrong in his analysis of voluntary
action and in his defense of that analysis, it does not follow that the
contrary-to-fact conditional he includes in the analysans of his analysis is not
causal. On that issue I am convinced that he is right and Austin wrong.
With this as preface, let me turn to the relevant parts of Austin’s critique
of Moore. Speaking of Moore’s illustration “I should have walked a mile
in twenty minutes this morning if I had chosen,” Austin first says that it
seems to be “an unusual, not to say queer, specimen of English,”1 and I
agree. I can also sympathize with Austin’s inclination to say that it means
the same as “If I had chosen to run a mile in twenty minutes this morning,
I should (jolly well) have done so.” I disagree, however, when Austin
construes Moore’s illustration as an assertion by the speaker of the speaker’s
strength of character, as an assertion of the fact that he puts his decisions
into execution. I also disagree with Austin when he says that he would
certainly not understand the sentence to mean that “if I had made a certain
choice, my making that choice would have caused me to do something.”2
To explain these disagreements I should first of all point out what I
do not think Austin denied, namely that a contrary-to-fact conditional
in the third person, like “If that powder had come in contact with a lit
match, it would have exploded,” makes no assertion about the character
of the powder, like “That powder is dry,” but it does imply that there are
features of the powder which, with the feature expressed by “That powder
came into contact with a lit match,” make up a conjunction that is connected
by law with the feature expressed by “That powder exploded.”3
One of those features may, of course, be the dryness of the powder, but this
is compatible with the contention that the dryness of the powder is not
asserted by the contrary-to-fact conditional in question. We may now go
further and say that the sentence “If that powder had come in contact
with a lit match, it would (jolly well) have exploded” is just as causal as
the same sentence without the “jolly well,” and that even the sentence
with the “jolly well” does not assert that the powder was dry. We may now
go still further and say that when I make the first-person statement, “If
I had chosen to walk a mile in twenty minutes this morning I should (jolly
well) have done so,” there is still no more than the implication that there
are features of a certain kind. There is no more assertion of my having
strength of character here than there is of my having had a good sleep
the night before. In short, it seems to me that Austin, the past master of non-
causal first-person sentences, has not in this instance produced one that
is not causal. Both it and its third-person analogue about the powder fail
to assert that an individual has a particular characteristic like strength
of character or dryness, and both may have “jolly well” added to them
for emphasis and remain causal in character.
I grant of course that there are other first-person conditional sentences
which Austin rightly describes as not being causal, but these other ones
do not count against Moore’s view. Consider, for example, what Austin
says about “I shall (do it) if I choose (to do it)” in an effort to show that
it is not causal. I think he is right when he contrasts it with “I shall ruin him
if I am extravagant” by pointing out that while the latter contains a causal
“if,” the former does not because it makes good sense to stress the “shall”
in the former but not in the latter.4 But this difference disappears when
we turn to the third-person counterparts of these sentences: “He will run
a mile in twenty minutes if he chooses” and “He (Smith) will ruin (Jones)
if he (Smith) is extravagant,” and therefore Austin can no longer use his
argument about stress in order to distinguish these two sentences. I am
sure that Moore would have used third-person illustrations throughout his
discussion of free will had he anticipated the irrelevant criticism that “I
shall” in “I shall if I choose” is “not an assertion of fact but an expression
of intention, verging towards the giving of some variety of undertaking.”5
Obviously, “He will run a mile in twenty minutes if he chooses” is not an
expression of intention, and therefore what Austin says about “I shall if I
choose,” even if it be correct, cuts no ice against Moore’s point when
illustrated by the use of third-person sentences. After all, Moore engages
in a discussion of “ifs” and “cans” in order to analyze a feature of actions
said to be right and wrong, namely, their voluntariness; and judgments of
right and wrong and of the voluntariness of past actions are typically made
in third-person sentences of the form “He acted wrongly when he did so-
and-so” and “He acted voluntarily when he did so-and-so.” Similarly,
when a person is about to perform an act that we think is voluntary, we
may say “He will do X if he chooses to do X.” But the person of whom we
say “He will do X if he chooses to do X” need never utter the words “I
shall do X if I choose to do X,” and therefore the allegedly noncausal
character of the latter sentence proves nothing about the former. We may
conclude that Austin has not shown in the arguments I have considered
that choices are not connected with actions by contrary-to-fact conditionals
which are causal.

II

I come now to an effort to show that actions may be explained without
the speaker asserting, implying, or relying on “because” statements (rather
than contrary-to-fact conditionals) as analyzed by advocates of the regularity
theory of explanation. One point of such philosophers as I have in
mind—chiefly, Mrs. Philippa Foot6—seems to be that statements like
“That man gave one quarter of his salary to the United Fund because he
is a generous man” provide a counterexample to the regularity theory
of explanation simply because it is false that all generous men give a quarter
of their salaries to the United Fund. But this point rests on the supposition
that advocates of the regularity theory must adopt what I think is a highly
vulnerable version of their theory, and once this version is abandoned
for a more tenable version of the regularity theory, the argument seems to
collapse. The more tenable version I have in mind is one which says that
the statement, “That match lit because it was struck,” is true if and only
if there are features of the match which, together with its being struck,
make up a conjunction that is connected lawfully with its lighting.7 If one
adopts this version of the theory it becomes clear that one may say “That
match lit because it was struck” without implying the falsehood that all
matches which are struck, light, and without abandoning the regularity
theory. Something analogous is true of “This match lit because it was dry.”
And once we see that we can assert “This match lit because it was dry”
without implying that all dry matches light, we can also see that we can
assert “That man gave one quarter of his salary to the United Fund because
he is a generous man” without implying, to use Mrs. Foot’s words, “that
where the character trait can be predicated the action will invariably
follow.”8 If what I have called the less vulnerable version of the regularity
analysis of singular “because” statements is adopted, we may regard
statements like “That man gave one quarter of his salary to the United
Fund because he is generous” as causal and logically linked with laws,
only the link is more complex than that envisioned by what may be called
a simpleminded advocate of the regularity theory.
We need not pause very long to show that this view allows us to see
why singular “because” statements which refer to traits in their antecedents
and events in their consequents are perfectly respectable causal statements.
Moreover, we can see by inspecting ordinary language that it is false to
say that an event can happen only because of an event. An automobile
accident may have taken place because of the icy condition of the road
and, as we now see, this may be true without its following that where
iciness can be predicated of a road on which automobiles are traveling,
automobile accidents invariably follow.

III

We may now turn to another argument offered by Mrs. Foot. She
points out, I think correctly, that when we explain a man’s action by
saying that he was acting out of revenge on a given occasion, we do not
cite one of his character traits, and here, she seems to think, we have a clear
example of an explanation which is not causal or deterministic. But I do
not think that when Mrs. Foot analyzes it she shows that it is an illustration
in her favor. This is where what I have called “crypto-determinism”
seems to emerge.
It emerges in the course of Mrs. Foot’s attempt to show that a statement
like “His motive in doing that was revenge” (which I take to be equivalent to
“He did that out of revenge”) is not causal in the sense favored by advocates
of the regularity theory of explanation. “Assigning a motive to an action,”
she says, “is not bringing it under any law; it is rather saying something
about the kind of action it was, the direction in which it was tending, or
what it was done as.”9 In other words, Mrs. Foot maintains that sometimes
when we seek an explanation of an action by asking for its motives we are
asking a question which is properly answered by our being told something
about what kind of an action it was. In that case, she seems to say, the
answer to our question will not be a “because” statement which may be
analyzed as we have analyzed “That match lit because it was struck” or
“That match lit because it was dry.” Mrs. Foot holds that in some cases
“finding the motive will be . . . described as finding what was being done—
finding, for instance, that someone was taking revenge”10 in, say, striking
a person. Then, if the question is “Why did he strike her?” the answer may
be “In striking her he was taking revenge on her.” How, now, do we find
out whether he was taking revenge on her? According to Mrs. Foot, “We
should take it that a man’s motive was revenge if we discovered that he was
intentionally harming someone and that his doing so was conditional on
his believing that that person had injured him.”11 But if this is how we
discover that he was taking revenge on her, then the statement “He was
taking revenge on her” is either logically equivalent to or supported by
a causal statement, namely, “He intentionally harmed her because he
believed that she had injured him.” And this is what I have called
crypto-determinism, for I do not see any plausible interpretation of “His harming
her was conditional on his believing that she had injured him” which does
not show it to be causal in precisely the sense in which “That match lit
because it was struck” and “That match lit because it was dry” are. In short,
the statement in which we explain the action by citing a motive is either
causal or supported by a causal statement. If this is so, then Mrs. Foot has
certainly not succeeded in excluding causal knowledge from our explanation
of an action by citing a motive. She tacitly admits that in certain cases
a person who explains an action by citing a motive must or may show that
the agent did something because he believed something. Why, then, should
we not conclude that the statement “In striking her he was taking revenge
on her” made in answer to the question “Why did he strike her?” is
equivalent to “He struck her because he believed that she had injured him,”
in which case an explanation which assigns a motive to an action is obviously
causal? It may not “bring the action under a law” in a simple way but it
implies the existence of a law just as “That match lit because it was
struck” and “That match lit because it was dry” do. How else were the
believing and the striking related if the latter was conditional on the
former?

IV

Finally I come to two arguments which unsuccessfully appeal to a
distinction between analytic and synthetic statements. First I consider
one offered by Mrs. Foot and then one which has been presented by
Professor Melden.
Mrs. Foot tries to show that intentions are not causally related to actions
by employing a distinction between analytic and synthetic statements.
Her illustration, with a slight alteration that does not affect the issue, is
“His motive for going to the station last night was to take a train to
London,” where his motive may be described as an intention. Mrs. Foot
says:

But where motives are intentions it is clear that they cannot be determining
causes; for intending to do X and being ready to take the steps thought necessary
to do X are connected not empirically but analytically. A man cannot be said
to have an intention unless he is reconciled to what he believes to be the
intermediate steps. We cannot speak as if the intention were something which
could be determined first, and “being ready to take the necessary steps” were a
second stage following on the first.12

What is being claimed here? The action is his going to the Oxford
station last night. His intention (and his motive) was to take a train to
London. So we may say “Jones went to the Oxford station last night because
Jones had the intention of taking a train to London.” Now even the most
hardened and confident user of that difficult term “analytic” cannot say
that this statement is analytic on the ground that intending to take a train
to London and being ready to take the steps thought necessary to take a
train to London are connected analytically. Presumably the statement which
Mrs. Foot believes is analytic is (A), “Because Jones intended last night
to take a train to London, Jones was ready last night to take the steps he
thought necessary to take a train to London.” But surely the analyticity
of (A) does not imply the analyticity of (B), “Because Jones intended
last night to take a train to London, Jones went to the Oxford station
last night.” To show the analyticity of (B), one would have to assert not
only that (A) is analytic but also assert the analyticity of (C), “Because
Jones was ready last night to take the steps he thought necessary to take
a train to London, he went to the Oxford station last night.” And surely
(C) is not analytic. It is logically possible that Jones was ready to take
the steps he thought necessary to take a train to London and yet did not
go to the Oxford station.
Once we point out that Mrs. Foot has made a fallacious inference from
the analyticity of (A) to the analyticity of (B), we have undercut the
contention that explaining Jones’ action last night by citing his intention
must be done in an analytic rather than a synthetic statement. The most that
she has shown is that his having the intention to take a train to London
and his being ready to take steps necessary to take a train to London are
analytically connected. However, we were not concerned to explain his
readiness to do something, but his doing something last night. Moreover,
as we have seen, even though the intention in question and the readiness in
question be connected analytically, the intention and the action are connected
synthetically if the statement of readiness to do something of a
certain kind does not analytically imply a statement that an action of that
kind has been performed.
Other philosophers have used the concept of analyticity or something
close to it in an effort to refute what I have earlier called the Humean view
of the connection between choices and actions. Ironically enough, some
of them appeal to Hume’s own more general views on causal connection in
order to undermine his more specific view of the connection between
choice and action, and between decision and action. The purpose of their
argument is to show that neither “He would have walked a mile in twenty
minutes yesterday if he had chosen to walk a mile in twenty minutes
yesterday” nor “She walked a mile in twenty minutes yesterday because she chose
to walk a mile in twenty minutes yesterday” is causal because each of them
violates one of Hume’s conditions for an empirical causal statement. The
choosing, it is said by Professor Melden, “must be logically distinct from
the alleged effect—this surely is one lesson we can derive from a reading
of Hume’s discussion of causation. Yet nothing can be an act of volition
that is not logically connected with that which is willed—the act of willing
is intelligible only as the act of willing whatever it is that is willed.”13 And
elsewhere Melden says, “If I decide to do X, the decision is intelligible
only as the decision to do X. The reference to the doing is logically essential
to the very thought of the decision.”14
From this I conclude that Professor Melden thinks that both “He would
have walked a mile in twenty minutes yesterday if he had chosen to walk
a mile in twenty minutes yesterday” and “She walked a mile in twenty
minutes yesterday because she chose to walk a mile in twenty minutes
yesterday” are analytic, or that they have whatever property it is that he
wishes to ascribe to such statements when their antecedents and consequents
refer to things that are “logically connected.” Here I will not take the
defensible but conversation-stopping tack of saying that I do not understand
what he means by “logically connected.” But I must ask whether
it is supposed that “He would have walked a mile in twenty minutes
yesterday if he had chosen to walk a mile in twenty minutes yesterday” is seen
to be true merely by examining the meanings of its component terms. And
I would ask the same question about “She walked a mile in twenty minutes
yesterday because she chose to walk a mile in twenty minutes yesterday.”
Does Professor Melden want to say, in the familiar jargon, that the denials
of these statements are self-contradictory or logically impossible? Does he
really want to say that choosing to do X logically necessitates doing X?
I can hardly believe this and therefore I am inclined to think that he has
gotten into his predicament by overlooking the following points.
Let us suppose that Melden is right in saying that we cannot understand
a sentence of the form “He decided to X” without understanding the
expression that is put for “X.” From this it does not follow that “Because
he decided to X, he Y’ed” is analytic or logically true. Perhaps an analogy
will help make the point. It might be said that there is a sense of the word
“understand” in which we cannot understand a sentence of the form “He
believes p” unless we understand the sentence in place of “p.” But even
if we should maintain this, we certainly would not say that “He believed
he would climb the Matterhorn the next day” logically implies “He climbed
the Matterhorn the next day.” William James seems to have thought
that on certain occasions “p” was the case because “p” had been believed
by him to be the case. Are we to say that “James climbed the Matterhorn
on January 1, 1891, because James believed that he would climb the
Matterhorn on January 1, 1891” is analytic? Surely James’s belief that
he would perform a certain action was not logically connected with, and
could have been causally connected with, his performing that action even
though we might say that the belief-sentence in question could not be
understood unless the nonbelief-sentence contained in it as a part was
understood.
I am inclined to think that the situation is similar in the case of
choice-sentences, which are, I might add in passing, intensional contexts as
belief-sentences are. “She chose to walk a mile in twenty minutes yesterday” may
be construed as meaning the same as “She chose for it to be the case
that she walked a mile in twenty minutes yesterday.” But surely this does
not logically imply “She walked a mile in twenty minutes yesterday.”
Choices are often frustrated. And that is why we cannot argue that the first
sentence cannot causally imply the second sentence by pointing out that
we cannot understand the first without understanding the second. Even
if it be true that we cannot understand the choice-sentence without
understanding the action-sentence, the choice-sentence may causally imply the
action-sentence.
If I am correct, Professor Melden has failed to see the difference between
the implications of saying that you cannot understand the logical conjunction
“She washed her hair yesterday and she walked a mile in twenty
minutes yesterday” without understanding its second conjunct, and the
implications of saying that you cannot understand “She chose for it to be
the case that she walked a mile in twenty minutes yesterday” without
understanding “She walked a mile in twenty minutes yesterday.” In the
former case the fact that you cannot understand the longer sentence without
understanding the shorter one may entitle you to say that the longer
logically implies the shorter but in the latter case this is not so. Perhaps this
also shows why one might well be leary about defining “analytic” in terms of
the notion of not being able to understand one expression without
understanding another. Those who are not leary about this might also ponder the
fact that you cannot understand “A or B” without understanding “A.”
Does it follow that “If A or B, then A” is analytic? Obviously not. This
example shows, of course, that we need not cite intensional contexts in
order to refute the view under consideration. Intensional contexts form
only a subclass of those that do not behave as Melden might like them to
behave. Think also of “If not-A, then A.”
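The point about truth-functional contexts can be checked mechanically if, purely for illustration, “analytic” is read in the narrowest possible way, as “true under every assignment of truth values.” That reading is an assumption of the following sketch and not a definition endorsed here, and the helper names is_tautology and implies are ours. The sketch shows only that a conjunction implies its conjunct under every assignment, whereas a disjunction does not imply its disjunct, and a negation does not imply the sentence negated, although in each case the longer sentence cannot be understood without understanding the shorter.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def is_tautology(formula, n_vars: int) -> bool:
    """True if formula holds under every assignment to its n_vars variables."""
    return all(formula(*values)
               for values in product([False, True], repeat=n_vars))


# "If A and B, then B": the longer sentence logically implies the shorter.
and_then_conjunct = lambda a, b: implies(a and b, b)

# "If A or B, then A": understanding requires "A", yet no implication holds.
or_then_disjunct = lambda a, b: implies(a or b, a)

# "If not-A, then A": again "A" must be understood, and again no implication.
not_then_self = lambda a: implies(not a, a)

if __name__ == "__main__":
    for name, formula, n in [
        ("If A and B, then B", and_then_conjunct, 2),
        ("If A or B, then A", or_then_disjunct, 2),
        ("If not-A, then A", not_then_self, 1),
    ]:
        print(name, "holds under every assignment:", is_tautology(formula, n))
```

The intensional cases of belief and choice lie outside what a truth table can represent, so the sketch bears only on the closing observation that intensional contexts need not be cited to make the point.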
In conclusion I should say that I have focused on the particular arguments
I have chosen to rebut, not only because they have been advanced by
serious philosophers from whose work I have learned much, but also
because they seem to me to reflect styles of argument that are more popular
in contemporary philosophy than they are effective. They illustrate, I
believe, three unfortunate tendencies in recent philosophy: an excessive
reliance on the supposedly eye-opening features of first-person sentences,
an excessively narrow view of the regularity theory of causal explanation,
and a deplorable looseness in most philosophical talk about the analytic,
the synthetic, and allied notions.

NOTES
1. J. L. Austin, “Ifs and Cans,” Philosophical Papers (Oxford, 1961), p. 156.
2. Ibid., p. 157.
3. See Nelson Goodman, Fact, Fiction and Forecast (Indianapolis, Second
Edition, 1965), chapter 1.
4. Austin, p. 159.
5. Ibid., p. 161.
6. Philippa Foot, “Free Will as Involving Determinism,” Philosophical Review,
LXVI (1957), pp. 439-450.
7. See my Foundations of Historical Knowledge (New York, 1965), chapter III.
8. Foot, op. cit., p. 441.
9. Ibid., p. 447.
10. Ibid.
11. Ibid.
12. Ibid., p. 444.
13. A. I. Melden, Free Action (London, 1961), p. 53.
14. Ibid., p. 203.
SOME EMPIRICAL ASSUMPTIONS IN
MODERN PHILOSOPHY OF LANGUAGE1
Noam Chomsky

The classical empiricist theory of acquisition of knowledge is perhaps the
clearest such theory that has received expression, and also, no doubt, the
most influential. It is hardly necessary to document its influence—“dominance”
might be a more accurate term—in linguistics and learning theory,
for example. I would like to consider here some of the ways in which
empiricist assumptions about acquisition of knowledge have appeared in
recent philosophy, specifically, in connection with the problem of language
acquisition. The question seems to me worth pursuing because of the
possibility of interpreting these assumptions as factual claims, claims which
are, furthermore, highly debatable—in my opinion, quite erroneous—although
no argument or evidence is given in support of them. Perhaps they
are regarded as self-evident. My purpose here is not to demonstrate the
falsity of these assumptions, but rather to try to determine how they enter
into some recent and important thinking about the nature of language.
In its classical version, a narrow empiricist doctrine of acquisition of
knowledge, as expressed most clearly by Hume, maintains that the mind,
initially a blank tablet, receives impressions from the sense organs as a
“passive mirror” (I omit, in this narrow theory, the matter of secondary
impressions). Faded impressions—ideas—are associated with one another
in accordance with certain fixed principles, which Hume enumerates as the
principles of similarity, contiguity, and cause-effect relation, the latter
being based on a certain “animal instinct.” This proposal concerning the
nature of mind becomes an empirical hypothesis of real substance when
the various notions that appear in it are properly clarified. Thus properties
of the sense organs will determine the nature of “impressions,” and properties
of the mind will determine just what counts as contiguity, similarity,
and inductive evidence for establishment of causal relations. (Modern
versions might add primitive S-R connections, certain assumptions about
stimulus sampling, etc., but this is irrelevant here.) As an empirical hypothesis
about the nature of mind, this empiricist view should be faced with
evidence, as Hume, in his own way, insisted. Evidence can be brought to
bear on this hypothesis at several levels. First, we might attempt to give
a careful characterization of some system of human knowledge and then
determine whether this system has the properties implied by the empiricist
assumptions about how knowledge must be acquired. At this level, the
question at issue is one of adequacy in principle of certain assumptions.
Second, assuming the answer to the first question to be affirmative, or at
least not obviously negative, we might go on to ask whether under empirically
given conditions of access to data, this system could have been acquired
by the mechanisms and in the manner postulated. This is a question
of feasibility. Analogous questions can be asked about animals and
their acquisition of knowledge and belief.
It is sometimes supposed that the empiricist assumptions constitute a
kind of null hypothesis, to be held until refuted, as the “simplest” or
“most natural” thinkable. This is a curious point of view. The issue of relative
“simplicity,” even if this notion can be given some content relevant
to choice among theories, can hardly be sensibly raised in connection with
theories so meager in confirming evidence and explanatory force as those
that have been proposed to account for learning and behavior. What is
more, it is difficult to see on what rational grounds an empiricist theory
can be shown to be “simpler” than, let us say, a pure reminiscence theory,
which might also be characterized in a quite definite way, and held, irrelevantly,
to be “simpler” in that it minimizes the role of learning. Surely,
it cannot be maintained that there are some physical conditions of plausibility
that support the empiricist view. It would, in fact, be quite remarkable
if the physical conditions under which the brain develops and its detailed
and intricate organization had no effect on the way associations can
develop, as presupposed in the empiricist view, though it remains a logical
possibility that this is so, just as it remains a logical possibility that all
mental structures are developed through association. There would be no
difficulty in designing a brain model that worked in quite a different way.
Thus to adopt the empiricist position without argument or evidence would
be mere dogmatism, a priorism of a sort that is no better founded in the
study of mind than in the study of any particular aspect of physiology, or in
any other intellectual pursuit.
It might be supposed that the modern investigation of learning departs
from the classical empirical approach in some fundamental way, avoiding
its limitations and inadequacies. This question is irrelevant to the specific
topic of this paper, but I think that anyone who explores the literature of
learning theory will discover that despite much increase in knowledge and
sharpening of technique, the theories that have been clearly formulated come
no nearer to providing an account of normal learning—language learning,
for example—than the classical approach. Either they remain within this
framework in essential respects, or else they have shifted attention to such
matters as control of behavior, establishment of behavioral repertoires or
habit hierarchies, or the setting up of S-R connections through conditioning,
and have thus largely abandoned the problem of accounting for acquisition
of knowledge. To say this is to make no criticism of this work, which must
be evaluated in terms of its own goals and interests, but simply to point
out that there is, for the moment, no more reason to give it the general designation
“theory of learning” than to give this designation to molecular
biology (which at least has the merit of occupying itself with fundamental
questions rather than with phenomena and processes which may very well
be quite marginal in normal learning or behavior).
Similarly, the commonly voiced argument that the basic questions are
begged in a nonempiricist account also seems to me entirely without substance.
How the mind reached its present state of structure and organization
is, of course, a fair question, though it is one that can hardly be posed
in any sensible way until we have achieved some understanding of the
nature of that structure and organization. But the particular assumptions
of, let us say, a theory that attributes certain ideas and principles to the mind
as an innate property do not beg that question any more or less than the
conflicting assumptions of the empiricist theory of mind. Of no greater
substance is the argument that whatever innate structure there may be developed
through evolutionary mechanisms, so that the empiricist approach
is still basically correct, though on the phylogenetic level.2 For one thing,
even if the notion of “development through evolutionary mechanisms” can
be made moderately precise and the claims of “phylogenetic empiricism”
verified in terms of this now clear concept, the whole matter would have
nothing to do with the question of origin of knowledge or belief in the case
of a given organism. Furthermore, it is important to bear in mind that this
notion is, for the present, almost empty, and with it, the claims of “phylogenetic
empiricism”; it is, in fact, perfectly possible that the innate structure
of mind is determined by principles of organization, by physical conditions,
even by physical laws that are now quite unknown, and that such
notions as “random mutation” and “natural selection” are as much a
cover for ignorance as the somewhat analogous notions of “trial and error,”
“conditioning,” “reinforcement,” and “association.” It is not that these
notions do not have a clear interpretation under certain restricted conditions;
what is in question is their significance when extended, in a purely
metaphorical way, well beyond the restricted conditions in which they have
been studied in many careful and important investigations.
Finally, certain essentially terminological objections might be raised
against alternatives that have been proposed to an empiricist hypothesis.
For example, there is evidence that the most primitive interpretation of
visual phenomena by human infants makes use of the basic perceptual
constancies, but one might ask whether the phrase “innate knowledge of
the properties of objects in three-dimensional space” is one that should
be used in describing the presumably innate schematism that underlies
these abilities. In general, there is no refined or sufficiently elaborate
terminology available for dealing with the “infinite amount of knowledge
of which we are not always conscious,” of which Leibniz spoke, or with
the unconscious innate principles that determine the character of this
knowledge. Therefore, one must extend, sharpen, or modify familiar usage
when studying these problems. It is conceivable that this practice might
lead to conceptual confusion of some sort, but unless this can be demonstrated,
the terminological issues need not delay the attempt to study the
innate structures and principles that serve as a precondition for experience
and the basis for acquisition of knowledge. In the particular case of language,
it is clear that at a certain state of maturation and development a
child can properly be said to know his language (perfectly, by definition,
language having no objective existence apart from its mental representation).
We can then ask what role his experience played in determining the
precise character of what he knows, for the most part unconsciously, at
this stage. It is hard to see why we should withhold the term “innate
knowledge of language” or “innate knowledge of the nature of language”
from principles or systems of substantive elements that underlie and enter
into the knowledge that is acquired, and that might be shown, through
empirical investigation, to be independent of experience in their form and
content. But the terminological question, in any event, hardly seems worth
pursuing.
In short, it seems to me that the question of the basis for acquisition of
knowledge is an open, empirical question, to be settled by empirical investigation
rather than by a priori argument or by pure conceptual analysis.
Specifically, the empiricist assumptions have no special status among the
many theories that might be proposed to account for the acquisition of
knowledge of language, or anything else.
Perhaps the clearest and most explicit development of what appears to
be a narrowly Humean theory of language acquisition in recent philosophy
is that of Quine, in the introductory chapters to his Word and Object.3 If
the Humean theory is roughly accurate, then a person’s knowledge of
language should be representable as a network of linguistic forms—let us
say, to first approximation, sentences—associated with one another and,
in part, associated to certain stimulus conditions. This formulation Quine
presents as, I take it, a factual assertion. Thus he states that our “theories”
—whether “deliberate,” as chemistry, or “second nature,” as “the immemorial
doctrine of ordinary enduring middle-sized objects”—can each
be characterized as “a fabric of sentences variously associated to one another
and to nonverbal stimuli by the mechanism of conditioned response”
(p. 11). Hence the whole of our knowledge (our total “theory,” in this
sense) can be characterized in these terms.
One difficulty that arises in interpreting such passages as these has to do
with the relation between language and theory, where the latter term covers
also general common-sense knowledge and belief. Quine’s views about the
interpenetration of theory and language are well-known, but, even accepting
them fully, one could not doubt that a person’s language and his
“theory” are distinct systems. The point is too obvious to press, but it is,
nevertheless, difficult to see how Quine distinguishes the two in his framework.
In fact, throughout the discussion, he seems to use the terms interchangeably.
For example, in Chapter 1, he discusses the learning of
language in general terms, exemplifies it by an example from chemical
theory leading up to the statement just quoted, then seemingly describes
the “vast verbal structure” so constructed, the associative network that
constitutes one’s knowledge of sciences (“and indeed everything we ever
say about the world”), as both the “body of theory” that one accepts and
the language that one learns. Thus the discussion of how one constructs
and uses a total theory of this sort concludes with the following statement:

Beneath the uniformity that unites us in communication there is a chaotic
personal diversity of connections, and, for each of us, the connections continue
to evolve. No two of us learn our language alike, nor, in a sense, does any
finish learning it while he lives.

Since this comment merely summarizes the discussion of how the “single
connected fabric” constituting our total theory is acquired (the latter discussion
itself having been introduced to exemplify language-learning), it
seems that Quine must be proposing that a language, too, is “a fabric of
sentences variously associated to one another and to nonverbal stimuli by
the mechanism of conditioned response.” Other parts of this exposition
reinforce the conclusion that this is what is intended, as we shall see in a
moment. Nevertheless, interpretation of Quine’s remarks is made difficult
at points because of his tendency to use the terms “language” and “theory”
interchangeably, though obviously he must be presupposing a fundamental
difference between the two—he is, for example, surely not proposing that
two monolingual speakers of the same language cannot disagree on questions
of belief, that controversy over facts is necessarily as irrational as
an argument between a monolingual speaker of English and a monolingual
speaker of German.
Elsewhere, Quine states that he is considering a language as a “complex
of present dispositions to verbal behavior, in which speakers of the same
language have perforce come to resemble one another” (p. 27). Thus if
a language is a network of sentences associated to one another and to external
stimuli by the mechanism of conditioned response, then it follows
that a person’s disposition to verbal behavior can be characterized in terms
of such a network. This factual assumption is far from obvious. I return to
other aspects of this concept of “language” below.
How is knowledge of such a language acquired? As noted above, a
Humean theory will acquire substance if such notions as “similarity” are
characterized in some way. Quine therefore postulates a prelinguistic (and
presumably innate) “quality space” with a built-in distance measure (pp.
83-84). Evidently, the structure of this space will determine the content
of the theory of learning. For example, one could easily construct a theory
of innate ideas of a rather classical sort in terms of a prelinguistic quality
space with a built-in distance measure. Quine would, apparently, accept a
very strong version of a theory of innate ideas as compatible with his
framework. Thus he considers the possibility that “a red ball, a yellow ball,
and a green ball are less distant from one another in . . . the child’s . . .
quality space than from a red kerchief.” It is difficult to see how this differs
from the assumption that “ball” is an innate idea, if we admit the same
possibilities along other “dimensions” (particularly, if we allow these dimensions
to be fairly abstract). In this respect, then, Quine seems to depart
quite radically from the leading ideas that guided empiricist theory and to
permit just about anything imaginable, so far as “learning” of concepts is
concerned. In particular, consider the fact that a speaker of English has
acquired the concept “sentence of English.” Suppose that we were to
postulate an innate quality space with a structure so abstract that any two
sentences of English are nearer to one another in terms of the postulated
distance measure than a sentence of English and any sentence of another
language. Then a learner could acquire the concept “sentence of English”
—he could, in other words, know that the language to which he is exposed
is English and “generalize” to any other sentence of English—from an
exposure to one sentence. The same is true if we mean by “sentence of
English” a pairing of a certain phonetic and semantic interpretation. We
could, once again, construct a quality space sufficiently abstract so that the
infinite set of English sentences could be “learned” from exposure to one
sentence, by an organism equipped with this quality space.
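The point about abstract quality spaces can be put as a small sketch. It is offered only as an illustration under stated assumptions, not as anything proposed by Quine or in the text above: the table LANGUAGE_OF, the function names, and the radius parameter are all hypothetical. What it suggests is that "generalization" from a single exemplar succeeds trivially once the distance measure itself encodes membership in the language, so that the measure, and not the exposure, carries all of the content.

```python
# Hypothetical stand-in for an innate, highly abstract quality space:
# the table below simply stipulates which language each item belongs to.
LANGUAGE_OF = {
    "birds fly": "English",
    "the sky is blue": "English",
    "der Himmel ist blau": "German",
}


def abstract_distance(a, b):
    """Assumed distance measure: 0 within a language, 1 across languages."""
    return 0.0 if LANGUAGE_OF[a] == LANGUAGE_OF[b] else 1.0


def generalize(exemplar, candidates, distance, radius=0.5):
    """Return every candidate lying within `radius` of the single exemplar."""
    return [c for c in candidates if distance(exemplar, c) <= radius]


# One exposure ("birds fly") suffices to pick out every English item.
learned = generalize("birds fly", list(LANGUAGE_OF), abstract_distance)
print(learned)
```

With the single exemplar "birds fly", the call returns both English items and excludes the German one. Nothing has been learned in any interesting sense, since the stipulated table already contains the answer.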
The handful of examples and references that Quine gives suggests that he
has something much narrower in mind, however; perhaps, a restriction to dimensions
which have some simple physical correlate such as hue or brightness,
with distance defined in terms of these physical correlates. If so, we
have a very strong and quite specific version of a doctrine of innate ideas
which now can be faced with empirical evidence.
It might be thought that Quine adds empirical content to his account by
his insistence that “the child’s early learning of a verbal response depends
on society’s reinforcement of the response in association with the
stimulations that merit the response . . .” (p. 82) and his general insistence
throughout that learning is based on reinforcement. But, unfortunately,
Quine’s concept of “reinforcement” is reduced to near vacuity. For example,
he is willing to accept the possibility that “society’s reinforcement consists
in no more than corroborative usage, whose resemblance to the child’s
effort is the sole reward” (pp. 82-83). To say that learning requires reinforcement,
then, comes very close to saying that learning cannot proceed
without data. As Quine notes, his approach is “congenial ... to Skinner’s
scheme, for . . . [Skinner] . . . does not enumerate the rewards.” The remark
is correct, but it should also be added that “Skinner’s scheme” is
almost totally empty, in fact, if anything even less substantive than Quine’s
version of it, since Skinner, as distinct from Quine, does not even require
that reinforcing stimuli impinge on the organism—it is sufficient that they
be imagined, hoped for, etc. In general, the invoking of “reinforcement”
serves only a ritualistic function in such discussions as these, and one can
safely disregard it in trying to determine the substantive content of what is
being proposed.
However, Quine returns to a classical empiricist conception of a nonvacuous
sort in his assumptions about how language is learned. Consistent
with his view of language as a network of sentences,4 he enumerates three
possible mechanisms by which sentences can be learned—i.e., by which
knowledge of language can be acquired (p. 9f.). First, sentences can be
learned by “direct conditioning” to appropriate nonverbal stimulations,
that is, by repeated pairing of a stimulation and a sentence under appropriate
conditions; second, by association of sentences with sentences (let us
put aside the objection that, in both cases, the associations should soon
disappear, through extinction, under normal circumstances); third, new
sentences can be produced by “analogical synthesis.”5 The third method
at first seems to offer an escape to vacuity, once again. Thus if the first
sentence of this paper is derivable by analogical synthesis from “the sky is
blue” (both involve subject and predicate, are generated with their interpretations
by the rules of English grammar, and share many other properties),
then it is no doubt true that language can be learned by “analogical
synthesis,” by “generalization” along a dimension of the abstract sort
suggested above (cf. p. 265). But it seems clear that Quine has nothing
of this sort in mind. The one example that he gives is a case of substitution
of one word for a similar one (“hand,” “foot”) in a fixed
context. And he seems to imply that the process of analogical synthesis
is theoretically dispensable, simply serving to speed matters up (see p. 9).
Therefore, we can perhaps conform to his intentions by totally disregarding
this process, and considering the knowledge attained by a long-lived
adult using only the first two methods instead of the knowledge attained
by a young child who has used all three (there being nothing that can be
said about the latter case until the notion “analogical synthesis” is given
some content). Noting further that a child of nine and a man of ninety share
knowledge of language in fundamental respects—each can understand and
use appropriately an astronomical number of sentences, for example—it
would seem, further, that little is lost in omitting “analogical synthesis”
from consideration entirely, even for the young child. Assuming that this
interpretation of Quine’s remarks is correct, we derive support for the
conclusion that he regards a language as a finite network of associated
sentences, some associated also to stimuli, since this is just the structure
that would arise from the two postulated mechanisms of language learning.
This interpretation of Quine’s rather inexplicit comments on “analogical
synthesis” is supported further by his practice of referring to acquisition
of knowledge of language as a matter of “learning of sentences.” It is
unclear what sense there would be to the assertion that a person has
“learned” a sentence that takes twice as long to say as his entire lifetime.
Correspondingly, Quine’s manner of formulating the process of language
acquisition would permit no way of stating how this sentence might be
encompassed in the person’s knowledge of his language; thus Quine offers
no way to describe the fact that the person has acquired rules or principles
that determine the form and meaning of this sentence. If language-learning is
based on the three mechanisms enumerated, it can yield a network of
“associated” sentences, but not a generative system of rules and principles
that determine the form and meaning of indefinitely many sentences.
“Learning of sentences” is not “learning of language.”
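The contrast drawn here can be sketched in a few lines. The sketch is illustrative only, and every name in it (the network table, network_sentences, generate, and the toy rule itself) is an assumption introduced for the example, not a claim about English grammar or about Quine's text. A network built pairwise from exposure contains exactly the sentences stored in it, whereas even a single recursive rule determines sentences of unbounded length that were never presented.

```python
# Hypothetical finite "network": sentences linked pairwise by exposure.
network = {
    "the sky is blue": {"the sea is blue"},
    "the sea is blue": {"the sky is blue"},
}


def network_sentences(net):
    """Sentences encompassed by the network: its nodes and nothing more."""
    seen = set(net)
    for linked in net.values():
        seen |= linked
    return seen


def generate(max_depth):
    """Toy recursive rule: S -> 'birds fly' | 'I think that ' + S."""
    sentence = "birds fly"
    for _ in range(max_depth + 1):
        yield sentence
        # Each application of the rule embeds the previous sentence.
        sentence = "I think that " + sentence


stored = network_sentences(network)   # fixed by what was presented
derived = list(generate(2))           # grows with max_depth, without bound
print(len(stored), derived[-1])
```

Raising max_depth yields ever longer sentences from the same rule, while the network can grow only by further exposure, one stored sentence at a time.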
Against this interpretation of Quine’s remarks on language we can bring
the fact that it is inconsistent with a truism that he of course accepts,
namely, that a language is an infinite set of sentences (with intrinsic meanings;
cf., e.g., p. 71). A network derived by the postulated mechanisms
must be finite; it can, in fact, contain only the sentences to which a person
has been exposed (repeatedly, and under similar circumstances). If we
return to the definition of “language” as a “complex of dispositions to verbal
behavior,” we reach a similar conclusion, at least if this notion is intended
to have empirical content. Presumably, a complex of dispositions is a structure
that can be represented as a set of probabilities for utterances in certain
definable “circumstances” or “situations.” But it must be recognized that
the notion “probability of a sentence” is an entirely useless one, under
any known interpretation of this term. On empirical grounds, the probability
of my producing some given sentence of English—say, this sentence
or the sentence “Birds fly” or “Tuesday follows Monday” or whatever—is
indistinguishable from the probability of my producing a given
sentence of Japanese. Introduction of the notion of “probability relative
to a situation” changes nothing, at least if “situations” are characterized
on any known objective grounds (we can, of course, raise the conditional
probability of any sentence as high as we like, say to unity, relative to
“situations” specified on ad hoc, invented grounds). Hence if a language
is a totality of speech dispositions (in some empirically significant sense
of this notion), then my language either does not include the sentences just
cited as examples, or it includes all of Japanese. In fact if the “complex of
dispositions” is determined on grounds of empirical observation, then only
a few conventional greetings, cliches, and so on, have much chance of
being associated to the complex defining the language, since few other
sentences are likely to have a nonnull relative frequency, in the technical
sense, in any reasonable corpus or set of observations—we would, for
example, expect the attested frequency of any given sentence to decrease
without limit as a corpus increases, under any but the most artificial conditions.
One might imagine other ways of assigning probabilities to sentences
on empirical grounds, but none, so far as I can see, that avoid these
difficulties. Hence if a language is a complex of dispositions to respond
under a normal set of circumstances, it would be not only finite (unless it
included all languages) but also extremely small.
Adding to the confusion is the fact that Quine appears to vacillate somewhat
in his use of the notion “speech dispositions.” Thus he formulates
the problem of “indeterminacy of translation” as resulting from the fact
that “manuals for translating one language into another can be set up in
divergent ways, all compatible with the totality of speech dispositions, yet
incompatible with one another” (p. 27). As just noted, if we take the
“totality of speech dispositions” of an individual to be characterized by
probability distributions for utterances under detectable stimulus conditions,
then the thesis quoted is true, near-vacuously, since except for a
trivial set, all such probabilities will be indistinguishable on
empirical grounds, within or outside of the language. On the other hand,
if we interpret the notions “disposition” and “situation” more loosely, it
might be argued that the problem is really quite different, that there will
be so few similarities among individuals in what they are inclined to say in
given circumstances that no manual of translation can be set up at all, compatible
with such inclinations. Actually, Quine avoids these problems in his
exposition by shifting his ground from “totality of speech dispositions” to
“stimulus meanings,” that is, dispositions to “assent or dissent” in a situation
determined by one narrowly circumscribed experiment. He even goes
so far as to say that this arbitrarily selected experiment provides all of the
evidence that is available, in principle, to the linguist (equivalently, to the
language learner—p. 39). Clearly, however, a person’s total “disposition to
verbal response” under arbitrary stimulus conditions is not the same as his
“dispositions to be prompted to assent to or dissent from the sentence”
under the particular conditions of the Gedankenexperiment that Quine
outlines. One might argue that by arbitrarily limiting the “totality of evidence,”
Quine irrelevantly establishes the thesis that alternative theories
(manuals of translation) exist compatible with all of the evidence (though
the general thesis of indeterminacy of translation is nevertheless certainly
true, in a sense to which we return in a moment). But my point here is
only that this kind of vacillation makes it still more difficult to determine
what Quine means by “disposition” or “language.”
It is easy to imagine a way out of the difficulties posed by the implied
finiteness of language and knowledge (or near emptiness, if the notion of
“disposition” is taken very seriously). Thus one might assume that knowledge
of a “universal grammar,” in the widest sense, is an innate property of
the mind, and that this given system of rules and principles determines the
form and meaning of infinitely many sentences (and the infinite scope of
our knowledge and belief) from the minute experiential base that is actually
available to us. I do not doubt that this approach is quite reasonable, but
it then raises the empirical question of the nature of this universal, a priori
system; and, of course, any philosophical conclusions that may be drawn
will depend on the answers proposed for this question.6 Quine’s attitude
toward an approach of this sort is not easy to determine. It certainly seems
inconsistent with his general point of view, specifically with his claim that
even our knowledge of logical truths is derived by conditioning mechanisms
that associate certain pairs of sentences (cf., e.g., p. 11f.), so that our
knowledge of logical relations must be representable as a finite network
of interconnected sentences. (How we can distinguish logical connections
from causal ones or either type from sentences which happen to be paired by
accident in our experience is unclear, just as it is unclear how either sort
of knowledge can be applied, but it is pointless to pursue this issue in
the light of the strangeness of the whole conception.) Elsewhere, however,
Quine appears to take the view that truth-functional logic might provide
a kind of “universal grammar.” Thus he asserts (§ 13) that truth
functions lend themselves to “radical translation” without “unverifiable
analytical hypotheses,” and hence can be learned directly from the available
evidence. He gives no real argument for this beyond the statement, which
appears quite irrelevant to the factual issue involved, that we can state
truth conditions in terms of assent and dissent. The inference from what
we can observe to a postulated underlying structure involving truth-functional
connectives of course requires assumptions that go beyond evidence—
mutually incompatible alternatives consistent with the evidence can easily
be constructed. Hence Quine’s willingness to place these matters within the
framework of radical translation perhaps indicates that he is willing to regard
the system of truth-functional logic as available, independently of experience,
as a basis for language-learning. If so, it seems quite arbitrary to
accept this framework as innate schematism and not to admit much else
that can be imagined and described.7 In view of the unclarity of this matter
and the apparent inconsistency of the proposal just discussed with Quine’s
explicit characterization of “theory” and “language” and the mechanisms for
acquiring them, I will put aside any further consideration of this topic.
We are left with the fact that Quine develops his explicit notion of
“language” and “theory” within a narrowly conceived Humean framework
(except for the possible intrusion of a rich system of innate ideas), and
that he characterizes language-learning (“learning of sentences”) in a way
consistent with this narrow interpretation, although the conclusion that a
language (or theory) is a finite fabric of sentences, constructed pairwise by
training, or a set of sentences with empirically detectable probabilities of
being produced (hence a nearly empty set) is incompatible with various
truisms to which Quine would certainly agree.
Quine relies on his empirical assumptions about the acquisition of knowledge
and learning of language to support some of his major philosophical
conclusions. One critical example will serve to illustrate. Fundamental
to knowledge are certain “analytical hypotheses” that go beyond the evidence.
A crucial point, for Quine, is that the correctness of analytical
hypotheses, in the case of ordinary language and “common sense knowledge,”
is not “an objective matter” that one can be “right or wrong about.”
These analytical hypotheses “exceed anything implicit in any native’s disposition
to speech behavior.” Therefore, when we use these analytical hypotheses
(as we must, beyond the most trivial cases) in translating, in learning a language
in the first place, or in interpreting what is said to us under normal circumstances,
we “impute our sense of linguistic analogy unverifiably to the
native mind.” The imputation is “unverifiable” in the sense that alternatives
consistent with the data are conceivable; that is, it is “strong verifiability”
that is in question. “There can be no doubt that rival systems of analytical
hypotheses can fit the totality of speech behavior to perfection, and can
fit the totality of dispositions to speech behavior as well, and still specify
mutually incompatible translations of countless sentences insusceptible of
independent control” (p. 72). These remarks Quine puts forth as the thesis
of “indeterminacy of translation.”
To understand the thesis clearly it is necessary to bear in mind that Quine
distinguishes sharply between the construction of analytical hypotheses on
the basis of data and the postulation of “stimulus meanings of observation
sentences” on the basis of data. The latter, he states, involves only uncertainty
of the “normal inductive” kind (p. 68). The same is true, apparently,
about the inductive inference involved in translation (similarly,
“learning” and understanding) of sentences containing truth-functional
connectives. In these cases, induction leads us to “genuine hypotheses,”
which are to be sharply distinguished from the “analytical hypotheses” to
which reference is made in the discussion of indeterminacy of translation.
Hence Quine has in mind a distinction between “normal induction,” which
involves no serious epistemological problem, and “hypothesis formation”
or “theory construction,” which does involve such a problem. Such a distinction
can no doubt be made; its point, however, is less than obvious.
It is not clear what Quine is presupposing when he passes over the “normal
uncertainty of induction” as within the range of radical translation. If
clarified, this would add more content to his empirical theory of acquisi¬
tion of knowledge, by specification of the a priori properties on which
“normal induction” and the notions of relevant and sufficient evidence are
based. It would then be necessary for him to justify the empirical assump¬
tion that the mind is natively endowed with the properties that permit
“normal induction” to “genuine hypotheses” but not “theory construction”
with some perhaps narrowly constrained class of “analytical hypotheses.”
To return to the thesis of indeterminacy of translation, there can surely
be no doubt that Quine’s statement about analytical hypotheses is true,
though the question arises why it is important. It is, to be sure, undeniable
that if a system of “analytical hypotheses” goes beyond evidence, then it
is possible to conceive alternatives compatible with the evidence, just as in
the case of Quine’s “genuine hypotheses” about stimulus meaning and
truth-functional connectives. Thus the situation in the case of language, or
“common sense knowledge,” is, in this respect, no different from the case
of physics. Accepting Quine’s terms, for the purpose of discussion, we
might say that “just as we may meaningfully speak of the truth of a sen¬
tence only within the terms of some theory or conceptual scheme, so on the
whole we may meaningfully speak of interlinguistic synonymy8 only within
the terms of some particular system of analytical hypotheses” (p. 75). But,
Quine answers:
To be thus reassured is to misjudge the parallel. In being able to speak of the truth
of a sentence only within a more inclusive theory, one is not much hampered;
for one is always working within some comfortably inclusive theory, however
tentative. ... In short, the parameters of truth stay conveniently fixed most of
the time. Not so the analytical hypotheses that constitute the parameter of
translation. We are always ready to wonder about the meaning of a foreigner’s
remark without reference to any one set of analytical hypotheses, indeed even
in the absence of any; yet two sets of analytical hypotheses equally compatible
with all linguistic behavior can give contrary answers, unless the remark is of
one of the limited sorts that can be translated without recourse to analytical
hypotheses [pp. 75-76].
Thus what distinguishes the case of physics from the case of language is
that we are, for some reason, not permitted to have a “tentative theory” in
the case of language (except for the “normal inductive cases” mentioned
above). There can be no fixed set of analytical hypotheses concerning lan¬
guage in general. We need a new set for each language (to be more pre¬
cise, for each speaker of each language), there being nothing universal
about the form of language. This problem, then, is one that faces the
linguist, the child learning a language (or acquiring “common sense knowl¬
edge,” given the interconnection between these processes), and the person
who hears or reads something in his own language.
To summarize, Quine supposes an innate quality space with a built-in
distance measure that is, apparently, correlated to certain “obvious” physi¬
cal properties. Furthermore, certain kinds of inductive operations (involv¬
ing, perhaps, generalization in this quality space) are based on innate
properties of the mind, as are also, perhaps, certain elements of truth-
functional logic. Utilizing these properties, the child (or the linguist doing
radical translation) can form certain genuine hypotheses, which might be
wrong but are at least right-or-wrong, about stimulus meanings and truth-
functional connectives. Beyond this, language-learning (acquisition of knowl¬
edge) is a matter of association of sentences to one another and to certain
stimuli through conditioning, a process which results in a certain network
of interconnected sentences or, perhaps, a certain system of dispositions to
respond. Language-learning is a matter of “learning of sentences.” It is
impossible to make significant general statements about language or com¬
mon sense theories, and the child has no concept of language or of “com¬
mon sense” available to him prior to his training. In this respect, the study
of language is different from, let us say, physics. The physicist works within
the framework of a tentative theory. The linguist cannot, nor can the psy¬
chologist studying a “conceptual system” of the common sense variety,
just as the child can have no “tentative theory” that guides him in acquiring
and learning from experience. Apart from difficulties of interpretation noted
above, this is a relatively clear formulation of a classical empiricist doc¬
trine. It involves assumptions which may or may not be true, but for which
Quine does not seem to regard evidence as necessary.
Let us briefly consider these empirical assumptions. It is, first of all,
not at all obvious that the potential concepts of ordinary language are
concepts characterizable in terms of simple physical dimensions of the
kind Quine appears to presuppose, or conversely. It is a question of fact
whether the concept “house” is characterized, for a speaker of a natural
language, as a “region” in a space of physical dimensions, or, as Aristotle
suggested, in terms of its function within a matrix of certain human needs
and actions. The same is true of many other concepts, even the most
primitive. Is a knife, to a child with normal experience, an object of such
and such physical properties or an object that is used for such and such
purposes; or is it defined by an amalgam of such factors, say as an object
meeting certain loose physical conditions that is used for a certain sort of
cutting? How would we in fact identify an object looking exactly like a
knife but used for some totally different purpose in some other culture?9
This is as much an empirical question as the question whether concepts
characterized in terms of a region in a space of simple physical dimensions
can be acquired in the way a child acquires his concepts. There is much
to be said in this connection,10 but it is enough to note in the present
context that Quine’s empirical assumptions may well be (I believe, certainly
are) far too strong—more correctly, too strong in the wrong direction—
and that they embody certain quite gratuitous factual assumptions.
Furthermore, consider the idea that “similarity” in a sense appropriate
for psychology, the kind of “similarity” needed for an empirical theory of
generalization, is definable in terms of distance in a certain space of physical
dimensions. There is nothing obvious about this assumption. Two two-
dimensional projections of a three-dimensional object may be “similar,” in
the relevant sense, for an organism that has an appropriate concept of the
three-dimensional object and its properties and an intuitive grasp of
the principles of projection, although there is no dimension of the pre¬
supposed sort along which such stimulations match. We could easily design
an automaton which would generalize from one such presentation to an¬
other, but not from one of these to a projection of some other three-dimen¬
sional object that matched the first in some simple physical dimension. We
could, of course, describe the behavior of this automaton in terms of a
more abstract quality space, just as we could describe an automaton that
learned English from a single sentence in these terms—see page 265, above.
But this is only to say that it is an empirical problem, quite open for the
time being, to determine what are the innate properties of mind that de¬
termine the nature of experience and the content of what comes to be known
on the basis of (or independently of) this experience.
As far as “learning of sentences” is concerned, the entire notion seems
almost unintelligible. Suppose that I describe a scene as rather like the
view from my study window, except for the lake in the distance. Am I
capable of this because I have learned the sentence: “This scene is rather
like the view from my study window, except for the lake in the distance”?
To say this would be as absurd as to suppose that I form this and other
sentences of ordinary life by “analogical substitution,” in any useful sense
of this term. It seems hardly necessary to belabor the point, but surely it is
clear that when we learn a language we are not “learning sentences” or ac¬
quiring a “behavioral repertoire” through training. Rather, we somehow
develop certain principles (unconscious, of course) that determine the form
and meaning of indefinitely many sentences. A description of knowledge of
language (or “common sense knowledge”) as an associative net constructed
by conditioned response is in sharp conflict with whatever evidence we
have about these matters. Similarly, the use of the term “language” to refer
to the “complex of present dispositions to verbal behavior, in which speakers
of the same language have perforce come to resemble one another” seems
rather perverse. Assuming even that the problems noted earlier (p. 268)
have been overcome, what point can there be to a definition of “language”
that makes language vary with mood, personality, brain lesions, eye injuries,
gullibility, nutritional level, knowledge, and belief, in the way in which
“dispositions to respond” will vary under these and numerous other irrelevant
conditions?11 What is involved here is a confusion to be found
in much behaviorist discussion. To mention just one further example, con¬
sider Quine’s remarks on synonymy in his “Meaning in Linguistics.”12
Here he proposed that synonymy “roughly consists in approximate likeness
in the situations which evoke two forms and approximate likeness in the
effect on the hearer.” If we take the terms “situation” and “effect” to refer
to something that can be specified in terms of objective physical properties,
as Quine would surely intend (say as involving observable stimulus condi¬
tion and observable behavior or emotional state, respectively), then the
qualifications in the characterization of synonymy just quoted seem mis¬
placed, for there is not even approximate likeness in the conditions that
are likely to elicit (or to serve as occasion for) synonymous utterances or
in the effects of such utterances. Suppose that I see someone about to
fall down the stairs. What would be the probability of my saying: “Watch
out, you’ll fall down the series of steps, arranged one behind and above
the other in such a way as to permit ascent or descent from one level to
another”; and what would the effect on the hearer be in this case? Or con¬
sider the likely circumstances and effects of “I’ll see you the day after
tomorrow,” “I’ll see you four days after the day before yesterday.” This
is not a matter of exotic examples; it is simply that the meaning of a
linguistic expression (hence synonymy) cannot be characterized in terms
of conditions of use or effects on hearers, in general. It is crucial to dis¬
tinguish langue from parole, competence from performance.13 What a per¬
son does or is likely to do and what he knows may be related in some way
that cannot, for the moment, be made precise; the relation is, however,
surely in part a factual and not a strictly conceptual one. Performance can
provide evidence about competence, as use can provide evidence about
meaning. Only confusion can result from failure to distinguish these separate
concepts.
Finally, what about the assumption that although in physics we may
work within the framework of a tentative theory, in studying language (or
in learning language or in translating or interpreting what we hear), this
is not possible since it is impermissible to make general statements about
language or, more generally, about our “common sense theories” and since
innate properties of the mind can impose no conditions on language and
theories?14 This is simply classical empiricist doctrine—perhaps “dogma”
would, by now, be a more accurate term. It is difficult to see why this
dogma should be taken more seriously than any other. It receives no sup¬
port from what is known about language-learning or from human or com¬
parative psychology. If it held true of humans, they would be unique in
the animal world; and there is no evidence for this particular type of unique¬
ness. In general, it seems to me correct to say that insofar as empiricist
doctrine has clear psychological content, it is in conflict with the not in¬
considerable information that is now available. In any event, returning to
the present theme, the particular assumptions that Quine makes about the
mental processes and structures that provide the basis for human language-learning
are quite unwarranted and have no special status among the many
assumptions that can be imagined. They can be justified only by empirical
evidence and argument. Philosophical conclusions based on these assump¬
tions are no more persuasive than the evidence on which the assumptions
rest; that is to say, for the present these conclusions are without force.
Interpreted in a psychological context, then, Quine’s thesis of indeter¬
minacy of radical translation amounts to an implausible and quite unsub¬
stantiated empirical claim about what the mind brings to the problem of
acquisition of language (or of knowledge in general) as an innate property.
This claim seems to me of only historical interest. Interpreted in an
epistemological context, as a claim about the possibility of developing
linguistic theory, Quine’s thesis is simply a version of familiar skeptical
arguments which can be applied as well to physics, to the problem of
veridical perception, or, for that matter, to his “genuine hypotheses.” It
is quite certain that serious hypotheses concerning a native speaker’s
knowledge of English, or the essential properties of human language—the
innate schematism that determines what counts as linguistic data and what
intellectual structures are developed on the basis of these data—will “go
beyond the evidence.” If they did not, they would be without interest.
Since they go beyond mere summary of data, it will be the case that there
are competing assumptions consistent with the data. But why should all
of this occasion any surprise or concern?
There are other examples of empiricist speculation of a rather similar
sort in modern philosophy of language. Wittgenstein’s statements about
language learning in the Blue and Brown Books are a fairly clear example,
if taken literally.15 He speaks of his “language games” as “the forms of
language with which a child begins to make use of words,” as “primitive
forms of language or primitive languages,” and he goes on to assert that
“we can build up the complicated forms from the primitive ones by gradually
adding new forms” (p. 17). He claims that “children are taught their
native language by means of such games” (p. 81). Elsewhere, he speaks
of children as learning language by “training” in the sense of “animal
training,” i.e., “by means of example, reward, punishment, and such like”
(p. 77). Although the cited example is a primitive and restricted one, it
is also one that Wittgenstein takes quite seriously, as we can see from later
references, specifically, §17, §50, §55. Furthermore, the notion of “training”
persists without qualification; cf., e.g., §7, §18. Later, “training” is distin¬
guished from “general training” (cf. §41, §51), but the latter notion is
left virtually unexplained. Although Wittgenstein does refer to the fact that
training presupposes a mind that understands (p. 97), he makes no attempt
to analyze or explore the significance of this fact or to determine its
specific role in determining how knowledge is acquired. (And if I may
interpolate a value judgment, it is just when these questions are raised that
the problems become challenging and of general intellectual significance.)
At the very outset of his discussion (p. 1), he divides explanations of
meaning (“very roughly”) into verbal and ostensive; elsewhere, he main¬
tains that words have meanings only when “we give them meanings by
explanations” (pp. 27-28). Since the verbal definition “in a sense gets
us no further,” it is ostensive definitions that are crucial for language-learning.
Ostensive definitions can be of two sorts (p. 12): The teaching
may be “a drill,” or it may supply us with a rule which is explicitly and,
presumably, consciously used as “part of the calculation.” Since the latter
case is almost nonexistent, ostensive definition, in practice, reduces to drill.
Putting these remarks together, Wittgenstein appears (on a literal reading)
to be claiming that language is taught almost wholly by drill and that
knowledge of language grows by training in new “language games” of
the peculiar sort that he describes, the process of training being essentially
that by which animals can be given an extended “behavioral repertoire.”
These are, surely, empirical claims. One who takes them seriously should
be willing to show, at least in a rough or suggestive way, how ordinary
knowledge of language, the kind of knowledge that enables one to under¬
stand some simple new sentence, can be described in terms of the mech¬
anisms exhibited in the curious examples of “language games” that are
presented as paradigmatic, as constituting the core of language to which
bits and pieces are added by further “training” and drill. It seems hardly
likely that one who faces the actual facts of language use will be inclined
to undertake this task. Surely Wittgenstein does not.
Do these quite unsupported factual assumptions play a role in establish¬
ing any of Wittgenstein’s conclusions of a philosophical nature? This is
hard to say, given the narrow limitations that Wittgenstein places on phi¬
losophy. Insofar as Wittgenstein is merely stating the limitations within
which he chooses to work (collecting many examples, indicating their
various similarities and differences, etc.), there are, as he points out, no
conclusions, hence no reference to a framework of assumptions. It is only
when some assertions are made about the relevance or importance of this
activity, or about certain empirical or conceptual problems, that the ques¬
tion of rational evaluation arises. It seems (subject to the qualifications of
footnote 15) that there are such examples, and that his empirical assump¬
tions do play some role. One of Wittgenstein’s rare categorical assertions
has to do with the presumed absurdity of “investigating, analyzing, the
meaning of a word” (pp. 27-28). He asserts that “a word hasn’t got a
meaning given to it, as it were, by a power independent of us, so that there
could be a kind of scientific investigation into what the word really means.
A word has the meaning someone has given to it.” As noted above, his
view is that “we give them [words] meanings by explanations.” The refer¬
ence is to conscious, explicit explanations of meanings that have been given
either in language-teaching or that the speaker will give when asked. If
we are not ready to give any explanation, then the word does not “have a
strict meaning.” This point of view is central to his extremely narrow con¬
ception of language acquisition, and much of what he purports to show,
or at least what others have taken him to show, is dependent on these
assumptions about the meaning of words and how these meanings arise.17
It is important to disentangle the factual element in this. Surely there is
no difficulty in principle in imagining an organism that approaches the task
of language-learning with a rich system of constraints on the possible gram¬
matical rules, possible concepts and their interrelations and relations to
sensory evidence, the characteristics of physical objects and the ways in
which they interact, the structure of space and time, the organization of
human action and the relation of empirical concepts to certain types of
human action, a theory of human needs, emotions, feelings, motives, and
so on. For such an organism, we would surely want to say that the meanings
of words are given to it in part by a power independent of any
conscious choice; and a scientific investigation of the conditions that deter¬
mine how an elaborate semantic and syntactic system was constructed on
the basis of evidence would certainly be quite in order. We could imagine
an organism so richly endowed that on scanty evidence it would acquire
full knowledge of a human language in all its scope and subtlety, quite
unconsciously, and with no element of voluntary choice and no ability to
give a conscious account (explanations) of the structure of its system of
concepts or its system of beliefs, the interrelations among these systems,
the strict rules by which it uses language, and so on. The richer the initial
endowment, the more limited can be the evidence on the basis of which
these mental structures are established. We know, in fact, that some evi¬
dence is necessary—there is more than one possible human language. Beyond
this, it is an empirical question what variety is possible, what is the nature
of the mental structure that develops, and to what extent (if any) its
development is more subject to voluntary control than the fact that under
certain physical conditions a human embryo will grow two legs. A rejection
of these possibilities, as seems to be entailed by a strict reading of the
comments of Wittgenstein quoted above, would therefore be purely dog¬
matic. In fact, from what little we do know about language, it is hard to
see how one can seriously doubt that the meaning of words is given largely
by a power independent of conscious choice and that language is both used
and learned,18 in accordance with strict principles of mental organization,
largely inaccessible to introspection, but in principle, at least, open to in¬
vestigation in more indirect ways.
It seems quite clear that Wittgenstein intends that there be no empirical
assumptions entering into his reasoning and analysis, that “there must not
be anything hypothetical in our considerations” (Investigations I, § 109).
Nevertheless, as the example just given indicates, I think one can make a
case that certain fairly central conclusions do, in fact, rest on empirical
hypotheses which are, furthermore, of a very questionable sort. It would
be interesting, but beyond the scope of this paper, to explore the matter
further in detail. Possible further directions are suggested by some of the
concrete examples that Wittgenstein discusses. Wittgenstein declares re¬
peatedly that he is concerned to rid us of the temptation to search for
particular mental acts accompanying thinking, meaning, wishing, expect¬
ing, and so on, a temptation which, he suggests, will disappear if we cease
to imagine the whole system of language (and belief) as a kind of permanent
background to what we say, a system that is “present to the mind” all at
once and that participates in determining the meaning of each individual
utterance that is produced or understood (Blue and Brown Books, p. 42).
And he focuses attention rather on what actually happens in particular
circumstances, for example, when one expects B to come to tea. As he cor¬
rectly observes, many things may happen, with endless variations—these
he refers to, curiously, as “the different processes of expecting someone to
tea” (p. 20). Having collected and described various cases of expecting,
our investigation, as philosophers, is not necessarily at an end (we can go
on to collect other more complicated cases); but, Wittgenstein holds, we
must emancipate ourselves from the “craving for generality” which makes
us dissatisfied with a collection of examples and leads us on a vain and
misguided search for the “common element” in all the applications of a
general term. However, there is much that is amiss in this description of
“expecting.” For one thing, it makes no sense at all to refer to “processes
of expecting” (or to exclamations as being “acts of expecting,” as in
Investigations I, § 586); and it is unclear what significance there is to the fact
that many different things (or, nothing of relevance) may be happening
during the period that one expects B to come to tea. I may be going about
my business quite as usual without expecting B to come to tea any less,
and alongside of expecting B to come to tea, I may also be expecting in¬
numerable other things (e.g., that B will be less than twelve feet tall, that
he will greet me warmly, that if admitted to Harvard, he will accept, etc.).
Most of these expectations I will not be aware of at all; I may find out about
them if, for example, something happens that is contrary to my expecta¬
tions,19 or I may never find out about them. In short, expecting, like believing,
seems to involve, among other things, a finite generative mental
structure that characterizes an unbounded set of possibilities. Limiting our¬
selves to observations of what, if anything, a person does “while expecting
something,” we will not even be able to raise the issues posed by an analysis
of the concept “expect” and its position in the “full calculus of language.”
Insofar as this is correct, it is not at all absurd to speak of an “image of the
world” that is available to the mind (though not in consciousness, for
the most part), and to think of the content of an assertion as determined
partly by this system of belief and partly by the full calculus of language that
determines the intrinsic meaning of each linguistic expression. The “processes”
and “acts” that Wittgenstein describes may provide the evidence
that leads us to say that A expects B to come to tea—they justify us in
making this assertion. But they do not constitute the meaning or content
of this assertion.
Perhaps this is all irrelevant to what Wittgenstein is trying to do, however.
Thus at several points he makes a distinction between “conscious mental
phenomena” and states of the mind that are postulated as hypothetical
constructs, to explain the conscious mental phenomena and much else.
The latter are the subject matter for psychology—they are involved in de¬
termining “causes”—but are not the business of the philosopher, who is
interested only in what “lies open to view,” specifically with the description
of a family of uses of a word. Thus he is concerned with a certain network
of related events but not with the hypothetical mechanisms of mind that
might explain them. Wittgenstein does not argue that there are no explanatory
mechanisms of this sort. Rather, he holds that these “internal work¬
ings of the mind” do not provide the criterion for the correct use of an
expression, and furthermore, that there may be no “conscious mental
accompaniments” to acts and no statable reasons for performing them. In
general, the philosopher’s “method is purely descriptive; the descriptions
we give are not hints of explanations” (p. 125; cf. also Investigations I,
§§ 109, 126; also the discussion in § 156 of postulated mechanisms that
“are only hypotheses, models designed to explain,” as opposed to pure
descriptions—the word “only” is curious here).
Thus consider the act of reading. The philosopher is concerned to dispel
the temptation “to regard the conscious mental act as the only real criterion
distinguishing reading from not reading” (p. 121), and his “explanation
of the use of this word . . . essentially consists in describing a selection of
examples exhibiting characteristic features” (p. 125). This selection of
examples will, presumably, provide (or suggest to a human intelligence) the
criterion for distinguishing reading from not reading. When Wittgenstein
refers to the “criterion for distinguishing reading from not reading,” he no
doubt has in mind criteria which relate to the meaning of the word “read,”
empirical conditions which are conceptually related in some manner to
correct (i.e., true) assertion that so and so is reading, etc. But the criteria
that Wittgenstein actually discusses, both here and elsewhere, are not in
fact “criteria for correct assertion” in this sense, but rather “criteria for
justified assertion,” that is, conditions under which a rational person would
be justified in stating, possibly erroneously, that so and so is reading, and
so on.20 These are criteria in the sense in which having a certain visual
image might serve as a criterion that justifies my asserting that there is an
oasis over there while walking through the desert, although my perfectly
justified assertion might still be incorrect. It is, however, surely criteria
for correct assertion that relate to problems of meaning. An account of the
meaning of “oasis” will involve reference to the presence of trees, water, and
so forth, not merely to the evidence, however persuasive, that justifies a
rational man in saying that he sees an oasis. And just the same is true of
the case of “reading.” In short, it appears that Wittgenstein’s rather obscure
use of the notion “criterion” may indicate a belief rather akin to a belief
in the possibility of phenomenalist reduction, a belief that meanings of
words (and criteria for correct assertion) are some sort of “logical con¬
struction” (in a loose way, with various reservations, etc.) from criteria that
justify assertion, that is, from evidence. At least, it seems difficult to make
sense of his examples on any assumption other than this.
To return to the case of reading, we may ask whether anyone is subject
to the temptation to regard a conscious mental event as the criterion for
justified assertion, in the manner that Wittgenstein describes; is there, in
fact, a temptation to be dispelled? Rather, it seems that there is a tempta¬
tion to regard the unconscious mental state attributed to A as the criterion
for correct (not justified) assertion of the statement that A is reading. And
Wittgenstein suggests no argument to indicate that this temptation is in any
way misguided. Perhaps, in fact, I have a (no doubt in part unconscious)
theory involving the postulated mental states of humans performing certain
acts such as reading, etc., which is related to my (also unconscious) system
of linguistic rules in such a way that I assert that A is reading when I be¬
lieve him to be in such a mental state, and my assertion is correct if my
belief is correct. In any event, there seems to be little point in arguing at
length that observed behavior provides the evidence that leads us to state
that A is reading and that justifies us in making the (possibly incorrect)
statement that A is reading. This is simply to prove the obvious. Nor is it
necessary to elaborate on the fact that such observed behavior falls into
vaguely related families; it is of the nature of bits of evidence to be fragmentary,
confusing, partial, loosely related, lacking sharp boundaries, etc.,
that is, to exhibit only “family resemblances.” This observation sheds no
light on the question whether I am attributing a “criterial mental state” to
a person whom I describe as reading, perhaps applying a tacit theory of
human action that guides my assessments and judgments. Much as in the
case of “expecting,” it is questions of the latter sort that might possibly
lead to an understanding of the concept “reading” and its position in the
“calculus of language.”
There is a curious frustration in the attempt to explore and understand
Wittgenstein’s thought. His examples and remarks, often brilliant and
perceptive, lead right to the border of the deepest problems, at which point
he stops short and insists that the philosopher can go no further. Ostensive
definitions do not carry us very far toward understanding how concepts are
learned, and simple “language games” barely illustrate the most superficial,
largely instrumental uses of language. In general, evidence of use is merely
a prerequisite for the study of meaning, as evidence of performance is merely
a prerequisite for the study of competence. To make interesting and intelli¬
gent use of evidence, one has to turn to the study of what it means to
“understand a language,” of the nature of the “forms of life” that determine
the “natural history of man,” and of the nature of the mind that understands,
of “that mechanism . . . born with B, which enabled him to
respond to the training in the way he did” (p. 97). Why, then, does Wittgen¬
stein’s discussion break off where it does, and why does he impose such
deadening limitations on the course that the philosopher may pursue?
Returning to the main theme, if we interpret Wittgenstein as intending
literally what he formulates as factual assertions (about language-learning,
for example), then what he says seems to fall within the framework of a
narrow and dogmatic empiricism. If we interpret him as merely circum¬
scribing the task of the philosopher, limiting it to a “purely descriptive
method,” to descriptions which “are not hints of explanations,” then what
he is proposing falls together with other, quite independent tendencies in
recent study of language, specifically, with certain tendencies in descriptive
linguistics which are also concerned to limit investigation to arrangement of
data and “pure description” that avoids any attempt at explanation (the
latter being occasionally stigmatized as a kind of infantile obsession). Both
approaches make the curious, and I believe stultifying, decision to con¬
centrate on evidence, regarded now as the subject matter of a new discipline
(descriptive philosophy, descriptive linguistics), putting aside the question
of what the evidence is evidence for. The traditional answer to this question
was that the observed phenomena constitute evidence for an underlying
mental reality; it would not have surprised any traditional theorist of lan¬
guage or mind that evidence falls into unilluminating networks of family
resemblances. It remains to establish the fact that there is some point in
restricting one’s activities to arrangement of data which are no longer re¬
garded as evidence for the construction of a theory of language or a theory
of mind.
There are, of course, very significant differences between Quine and
Wittgenstein in their approach to language, mind, and behavior. There also
appear to be certain similarities in tacit empirical assumptions and also, ap¬
parently, similarities of a more programmatic nature. Quine consciously and
purposefully follows modern behaviorism in restricting his concept of “what
is learned” to a certain system of dispositions to behave, a habit system, a
network of associations. Since this is an entirely inadequate concept of
what is learned, his account, like that of modern behaviorism, is largely
irrelevant to the problem of acquisition of language (or knowledge, or
belief), whatever merits it may have in its own terms. In a parallel way,
Wittgenstein (similarly, much of modern descriptive linguistics) explicitly
restricts himself to descriptions that do not offer even the hint of an ex¬
planation. By thus restricting himself to data in and for itself, as the subject
matter of the philosopher’s exclusive attention, he necessarily turns away
from many interesting and significant questions about the mental reality
(language, systems of belief, the basis for perception, etc.) that might be
illuminated by use of this descriptive material not merely as data but as
evidence. In both cases, we find a restriction of attention to behavior, a
studied refusal to examine and elaborate the mental structures21 that underlie
observed performance.22 There can be no objection in general to restriction
of attention to a limited subject matter, but one must ask always whether
the domain that is delimited is viable and significant. In this case, serious
doubts are in order. In linguistics, I think that the restriction to description
that does not give a hint of explanations has been damaging, and an argu¬
ment can be given that the same is true more generally in psychology. It
is doubtful that any serious insight into the nature and organization of be¬
havior can be achieved within these limitations; and furthermore, it is far
from clear why an understanding of the organization (or control) of behavior
should be regarded as comparable in importance or intellectual interest
to an understanding of the underlying mental reality that can be illuminated
by the use of behavior as evidence rather than as the object of study.
In the specific case of Quine and Wittgenstein, it seems to me that the
restrictions that they impose simply exclude from serious study the many
fascinating questions that they themselves raise. Classical empiricism can,
I think, be reasonably interpreted as an interesting and substantive theory
of mind, which is, however, wrong in its specific assumptions and misguided
in principle. But its modern variants, in philosophy or in “behavioral
science,” sometimes reveal a distressing tendency to exclude in principle
the kinds of endeavor that might some day significantly enrich our under¬
standing of man’s essential qualities and their remarkable manifestations.

NOTES
1. I am indebted to Donald Brown, Jerrold Katz, and Charles Chihara for com¬
ments on an earlier version of this paper. This work was supported in part by The
National Science Foundation under grant GS-1430.
2. Formal parallels between kinds of explanations offered in learning theory
and evolutionary theory have frequently been pointed out. See, for example, B. F.
Skinner, “The phylogeny and ontogeny of behavior,” Science, Vol. 153, No. 3741,
9 Sept., 1966, 1205-1213, for recent discussion. Some earlier ideas are discussed by
W. H. Thorpe, Learning and Instinct in Animals, Methuen, London, 2d edition, 1963,
p. 164f. A parallel that is too infrequently noted is the difficulty, in both cases, of
constructing a nonvacuous thesis of any generality.
3. W. V. O. Quine, Word and Object, John Wiley and Sons, New York, and
Technology Press, Cambridge, 1960.
4. Accepting, that is, the interpretation of his remarks that is discussed above.
5. Elsewhere, Quine states that “the learning of these wholes (sentences) pro¬
ceeds largely by an abstracting and assembling of parts” and that “as the child
progresses, he tends increasingly to build his new sentences from parts” (p. 13). For
consistency of interpretation, we must suppose that this refers to “analogical syn¬
thesis,” since the three methods enumerated are intended to be exhaustive. If some¬
thing else is intended, then the scheme again reduces to vacuity, until the innate basis
for the “abstracting” and “assembling” is specified.
6. It is interesting that Russell, in his Inquiry into Meaning and Truth, Allen
and Unwin, London, 1940, with his concept of real logical form and of logical words
as expressing a mental reality, does appear to presuppose a structure that would
avoid at least these very obvious problems. But a discussion of Russell’s quite in¬
tricate and interesting approach to these questions, though a useful undertaking, is
impossible within the scope of this paper.
7. The reasons for this choice would take us too far afield, into a much more
general consideration of Quine’s thesis, developed later in the book, about the scheme
of discourse that one must use in “limning the true and ultimate structure of reality”
(p. 221), and in describing “all traits of reality worthy of the name” (p. 228).
8. Recall, again, that Quine is using the concept of “interlinguistic synonymy”
as a device for describing not only translation but also learning of language in the
first place and interpretation of what is said to him by one who knows a language.
9. Cf. Philippa Foot, “Goodness and choice,” Proceedings of the Aristotelian
Society, Supplementary Volume 35; 45-60. She comments, correctly I am sure, that
we would describe such objects as looking exactly like knives, but being something
else. See also the remarks by J. Katz on such words as “anesthetic” in his “Semantic
theory and the meaning of ‘good,’” Journal of Philosophy, Vol. 61, No. 23, Dec. 10,
1964, pp. 739-66.
10. Consider, for example, the experimental evidence that has been produced
purportedly showing differences between apes and humans in ability to carry out
cross-modal transfer. The difference is sometimes attributed to the “linguistic tags”
available to the human. (Cf. A. Moffet, and G. Ettlinger: “Opposite responding in
two sense modalities,” Science, No. 3732, 8 July, 1966, 205-206, and G. Ettlinger,
in Brain Mechanisms Underlying Speech and Language, F. L. Darley, ed., in press).
Another possibility that suggests itself is simply that the “concepts” used in the
experimental situation, being defined in terms of conjunction or disjunction of
elementary physical properties (as is the general procedure in concept-formation
experiments), are entirely artificial and mismatched to the “concept space” of the
tested animal. The human subject, however, imposes his own system of concepts
(since he understands what the experiment is about, etc.). Under the conditions of
the experiment, the distinction between the artificial concepts of the experimenter
and the natural concepts of the subject might well be undetectable. Hence it might
be that no difference between apes and humans in cross-modal transfer (and nothing
about linguistic tags) has yet been shown by such experiments, and that what is
shown is merely that an animal (or human) cannot make reasonable use of concepts
that are mismatched to the innate structure of his system of concepts.
11. Of the cited conditions, the one that might be regarded as relevant is “knowl¬
edge and belief.” Thus it makes sense to argue that under certain conditions, a change
in belief may entail a modification of language. But surely it is senseless to hold that
wherever difference of belief leads to a difference of disposition to verbal behavior,
there is necessarily a difference of language involved.
12. In From a Logical Point of View, Harvard University Press, Cambridge, 1953.
13. The issue is not simply one of observation versus abstraction but rather one
of significant versus pointless idealization. A set of dispositions to respond is a
construction postulated on the basis of evidence, just as is a generative grammar
that attempts to characterize “knowledge of a language.” In Quine’s terms, the first
is based on “genuine” and the second on “analytical” hypotheses, but only in a sense
of “genuine” that is divorced from its ordinary meaning (or else on the basis of a
value judgment that seems to me quite unsupportable). It would be more accurate
to say that setting up a “complex of dispositions to respond” is merely a pointless
step, since such a structure has no interesting properties, so far as is known.
14. Except, as noted earlier, for the constraints imposed by the structure of the
quality space, the system of truth-functional logic, certain primitive forms of in¬
duction, and the capacity to form arbitrary associations.
15. One difficulty in interpreting Wittgenstein, however, is that it is unclear when
what he says is to be taken literally. Some remarks are so outrageous that one can
only suppose that something else was intended (e.g., when he asserts that thinking
may be the “activity performed by the hand, when we think by writing”). A second
difficulty is the indefinite and noncommittal style. Still, remarks such as those cited
seem to indicate a fairly specific point of view.
Still another difficulty, in the specific case of the Blue and Brown Books is, of
course, the question whether in detail they do or do not reflect Wittgenstein’s views.
In the Philosophical Investigations, the positions that I want to discuss are less
prominent than in the Blue and Brown Books, though I find little there that might
challenge these positions. In any event, what I will be discussing is the text of the
Blue and Brown Books, which may or may not express what Wittgenstein really
believed about these matters.
16. This kind of terminology may suggest, erroneously, that animal training is
simply a matter of adding arbitrarily selected bits of behavior to a collection of
“habits.” Actually, such terms are appropriate to a description of animal training
only at the most superficial level. For some interesting comments on this matter, see
K. Breland and M. Breland, “The misbehavior of organisms,” American Psychologist,
16, 1961, 681-84; also Thorpe, op. cit., p. 461-2 and references cited there.
17. Similar ideas appear elsewhere. For example, D. F. Pears argues that naming
is ultimately inexplicable, “that any comprehensive explanation of naming is neces¬
sarily circular” (“Universals,” in A. Flew, ed., Logic and Language, second series,
Basil Blackwell, Oxford, 1953). Ostensive definitions clearly do not in themselves
explain the basis for naming, and verbal explanations do not give us an exit from the
“maze of words.” “Naming cannot be explained by anything which really goes beyond
a reasoned choice of usage,” since concepts are “completely identifiable only by their
use.” All that we can say about a “well-constructed series of things” (i.e., a set of
things named by a word in a single sense) is just that it is well constructed. No
comprehensive answers can be given to the question of what is common to such
sets or how such “nameable” sets are distinguished, on general grounds, from other
sets. Experimental psychology can only give “the varying tests of the good con¬
struction of a series, and not its essence.” It cannot, then, characterize the essential
properties of “generalization” or “concept formation,” for a given organism, pre¬
sumably. The desire to explain naming, beyond recording details of usage or giving
tests, is “the result of the Protean metaphysical urge to transcend language.”
Whether a concept is identifiable in terms of its “use” or in terms of “a reasoned
choice of usage” is impossible to discuss, given the vagueness of these notions, but
it certainly seems clear that the problem of determining what is a “well-constructed
series of things” for a particular organism is one that can be studied in a serious
way. The varying tests provided by experimental psychology can provide evidence,
as can observation of usage and introspection. And on the basis of such evidence,
one can try to formulate a theory regarding the “attainable concepts” and the
“attained concepts,” in particular, the system of concepts embedded in a person’s
internalized grammar. Evidently, the theory that we construct about a person’s
grammar (or about attainable grammars) will be underdetermined by evidence,
since the task is not a triviality—the concepts that we attribute to him will not be
“completely identifiable,” in any interesting sense, by observed usage, without the
intrusion of certain assumptions of a theoretical nature. Having arrived at a tentative
theory concerning a person’s grammar, we can consider the data on the basis of
which he constructed this internalized grammar, which incorporates, in particular, a
system of concepts. Finally, we can construct a hypothesis as to the a priori system
of principles, conditions, and assumptions that led the person to construct this system
from the given data (where the process is, no doubt, not only unconscious but
also, quite likely, inaccessible to consciousness, and, quite likely, entirely
deterministic insofar as the vast central areas of language are concerned). Such a
hypothesis will specify, correctly or incorrectly, what can be “well-constructed sets”
for a human, and will offer an explanation as to why a particular term has a specific
referential scope. The problem of “explaining naming,” so formulated, may be largely
intractable, but it does not reflect a “Protean metaphysical urge” any more than the
attempt to study various other, for the moment relatively intractable problems, for
example, problems involving determinants of maturation and growth or origin of
species, etc.
18. Though certainly language is not taught in accordance with strict rule. In fact,
there is no reason to suppose that language need be taught at all.
19. But the matter is not simple. Thus one must distinguish expecting someone
from expecting of someone that he will do something, or simply expecting that some¬
thing will happen or will be the case. And one must distinguish not expecting some¬
thing to happen from expecting it not to happen; and, as in the case of belief and
knowledge, one must identify and analyze separately the various factors that may
permit the consequences of what is expected, even those that follow by known and
accepted principles, to be themselves unexpected. A better analysis of what is
involved in “expecting” might build on some of the remarks in the Philosophical
Investigations I, §§ 575, 577. Although such an analysis must consider those
consequences of a rule that one may draw “as a matter of course” (ibid., § 238), it will
surely be inadequate if it limits itself to these, avoiding the implications of the
possibility of Socratic “teaching.”
20. Whatever he may have meant by the term “criterion,” his general practice is
to give criteria for justified but not necessarily correct assertion. There are many
examples. Investigations I, § 344: “Our criterion for someone’s saying something to
himself is what he tells us and the rest of his behavior.” Investigations I, § 269: “Let
us remember that there are certain criteria in a man’s behavior for the fact that he
does not understand a word, that it means nothing to him, that he can do nothing
with it. And criteria for his ‘thinking he understands’, attaching some meaning to the
word, but not the right one. And lastly, criteria for his understanding the word
right.” Investigations I, §§ 154-155: “If there has to be anything ‘behind the utterance
of the formula’ it is particular circumstances, which justify me in saying I can go
on—when the formula occurs to me . . . for us it is the circumstances under which
he had such an experience that justify him in saying in such a case that he
understands, that he knows how to go on.” Evidently, it is what a person tells us
and what he does that justifies us in believing that he is saying something to himself;
evidently, there is no necessary connection between satisfaction of these “criteria”
and the fact of his saying something to himself. Similarly, in the other cases. Hence
it is hard to imagine how one could claim that such “criteria” could somehow
exhibit the meaning of such expressions as “he is saying something to himself.”
For an illuminating discussion of this whole matter, see Rogers Albritton, “On
Wittgenstein’s use of the term ‘criterion’,” J. of Philosophy, vol. 56, 1959, pp. 845-857,
reprinted, with an additional note, in G. Pitcher, ed., Wittgenstein: The
Philosophical Investigations, Anchor Books, Garden City, N. Y., 1966.
If there were anyone who held the view that a conscious mental act (of the
reader) is the real criterion for distinguishing reading from not reading, he would
not be much impressed by a selection of examples exhibiting characteristic features,
etc., of cases where a person is said to be reading (possibly erroneously). Rather, he
would hold that these examples do, in fact, constitute a “veil of inessential features,”
and that evidence of usage could no more refute his conviction that reading involves
a conscious mental act than citation of characteristic usages of “There is an oasis”
could refute his convictions regarding the conditions under which this assertion is
true and regarding the meaning of “oasis.” The “transitional cases” that Wittgenstein
imagines would be regarded as unclear by such a person, on the grounds that we do
not have convincing evidence as to the presence of the criterial mental state. We
might say that in “normal cases” criteria for justified usage do serve as criteria for
correct assertion as well, but this is empty unless “normal cases” are something other
than cases in which justified assertions are, furthermore, correct. The long discussion
of the same matter in the Investigations seems to me to shed no further light on
the issues.
21. Or, potentially, their physical realizations, though it must be emphasized
that this is a subject that is not, for the moment, accessible to direct study.
22. The currency of the appellation “behavioral science” for the study of man
and society is another manifestation of the same intellectual tendency in that it in
effect defines this study in terms of its data rather than its natural subject matter.
It is almost as though physics were renamed “science of meter-reading.” What, in
fact, would we expect the natural sciences to amount to in a culture that felt satisfied
with a characterization of them in such terms as these?
IF MATTER COULD TALK
Fritz Machlup

The differences between the natural and the social sciences have been
both exaggerated and minimized. To some, especially Anglo-American
writers, the differences have seemed so categorical that they decided to
appropriate the designation “science” for the natural sciences and to deny
it to the study of social phenomena. Others, especially German writers,
insisted on the scientific character of the study of cultural phenomena but
still held that natural and “cultural” sciences were so fundamentally differ¬
ent that they required “contrary” methodological approaches.
These extreme positions had to be countered; it was important to show
that in most respects, especially regarding the logic of inquiry, cognition,
generalization, verification, and application, there were no fundamental dif¬
ferences between natural and social sciences. Philosophers of science who
applied themselves to this task have, however, in their zeal to correct the
errors of the exaggerators of contrast, sometimes gone too far in minimizing
genuine differences. To recognize these differences may do a great deal
for the comprehension of both the unity and the departmentalization of
science.
This essay is intended to present an issue which has an important bearing
on the difference between the natural and the social sciences. Following
my inclination to dramatize ideas when I want the reader to share my
appraisal of their importance, I shall introduce the issue by means of a
short story or parable.

A PARABLE

They had debated the proposal to telephone the physical laboratories at
Harvard, Princeton, and Chicago and notify their counterparts in these
institutions of their exciting observations; but then they felt unsure and
decided to call a psychiatrist.
“Doctor, please come to the physics laboratory, Columbia University.
A group of seven men—three professors and four assistants—apparently
are suffering from strange hallucinations, although none of us has taken
alcohol, LSD, or any other drugs. We all hear voices. They seem to come
from inside our machines and apparatuses, in clear English. If we are not
crazy, we are going crazy. Please come immediately.”
When the psychiatrist arrived, he found the physicists engrossed in con¬
versation, not with one another but each with some persons hidden in all
sorts of containers, cabinets, and machines.
“Are you making fun of us? Is this a hoax, or what?” Professor R. spoke
into an apparatus of stainless steel, cylindrical in shape.
“Nothing of the sort,” a voice answered from the inside of the apparatus.
“We simply have decided to end our silence and cooperate with you in your
research work by telling you all we know.”
The professor greeted the psychiatrist and introduced him to his col¬
leagues and assistants. At this point he was called to the telephone. He
returned after several minutes.
“The same thing happened in Princeton. Professor W. was on the phone.
Apparently it started there at the same time as here. At the Forrestal
Laboratory the people panicked after the stellarator started talking. . . .”
He was interrupted by a newcomer. “Someone from the New York Times
called. He wants you to comment on a dispatch from Moscow. There are
two strange headlines in Tass. One says: ‘New Elementary Particles Are
Russian’; the other says: ‘Genes Pass Resolution Siding with Lysenko’.”
Before Professor R. was able to answer, Dr. M., an instructor, entered.
He was excited and, without waiting for his chief’s nod or question, he
began to report on his lab section. He had been talking to a group of under¬
graduates, demonstrating various cases of Brownian motion. As he spoke
about the random walk of molecules and about molecular collisions at
various pressures, someone shouted, “Stop that nonsense!” When he looked
around to see which student had made this impertinent remark, the voice
continued. It was obviously coming from the protective chamber with the
suspended mirror, whose movements were being tracked by the fluctuations
of a reflected light beam. This is what he heard: “It is time that you cease
and desist from misleading your students. What you teach about us mole¬
cules is simply not true. This is no random walk and we are not pushing
one another all over the place. We know where we are going and why. If
you will listen, we shall be glad to tell you.” He had not waited for more,
but had rushed here to report and get Professor R. to witness the event and
to hear what the molecules were about to tell.
“Oh,” said Professor R., “you mean they are going to tell us what they
think they are doing. By all means, let them go ahead.”

A SKETCH OF THE HISTORY OF THE THEME

I shall resist the temptation to spin this yarn further. To do so might
be fun, but each of us can do it in his spare time and make a short story
long. The parable has served to pose the issue, that is, to ask what problems
would arise in the natural sciences if inanimate matter began to talk. It
is a fantastic idea, to be sure, but an idea worth exploring. Before I pro¬
ceed, however, I shall acknowledge how it came to me.
The theme—that animals, trees, and inanimate objects could be endowed
with the gift of human speech—is, of course, as old as literature. Legends,
fables, and fairy tales are the best-known sources; in Homer’s Iliad we en¬
counter a talking horse, that of Achilles; Aesop’s fables and the tales by
the Grimm brothers and by Hans Christian Andersen are full of talking
and chatting foxes and wolves, trees and flowers, storks and ducks, the
sun and the wind, and teakettles, mirrors, and street lamps. In addition,
there are the stories of Orpheus, who moved rocks and rivers by his songs;
and there have been many anthropomorphic parts in epic and lyric poetry,
in tragedy and in comedy.
As a youngster I delighted in reading books by Carl Ewald; among them
was one with beautiful Tales Told by Mother Nature about talking animals
and objects.1 There was one in which earth and a comet had a discussion,
joined in by the moon; another featured a chat between a spider and a
mouse. A conversation between the sea and various plants and birds oc¬
curred in one tale, and another had a talk, with interesting implications of
conscious cooperation, between a soldier-crab and a sea-anemone. There
was also a most informative debate among five germs: tuberculosis, cholera,
and diphtheria complaining about man’s warfare against them, mold bragging
about its great power, and yeast defending man as its best friend.
Much later I became acquainted with the writings of E. B. White and
I fell in love with Charlotte’s Web. But the most philosophical stories of
this genre are in the poems by Christian Morgenstern. The manifesto of the
“West Coasts,” protesting the semantic willfulness of man and declaring
their semantic independence,2 belongs in the notebook of every language
philosopher. But none of the human talk of these nonhuman beings and
things included, to my knowledge, any allusions to the problem of scientific
procedure.
In methodological and epistemological discussions of the social sciences,
references to a cognate, though inverted, theme can be found: Several
writers have mentioned that the natural sciences lacked two sources of
information—inner experience and verbal communication—which were of
essence in the social sciences. We are familiar with statements by social
scientists reflecting about their advantage—in some measure compensating
for several disadvantages—in having access to data of inner experience un¬
available to natural sciences. Thus Friedrich von Wieser wrote: “We can
observe nature from the outside only, but ourselves also from within. And
since we can do it, why should we not make use of it?”3
The emphasis here was on the scientific observer’s ignorance of how it
feels to be a molecule, an electron, or a gene, contrasted with his knowledge
of how it feels to be a human being, suffering pain, enjoying pleasures,
and making decisions. There was little emphasis, as far as I know, on the
scientific observer’s inability to interrogate, and receive communications
from, inanimate objects, in contrast with his ability to interrogate, and
listen to verbal reports from, large samples of the members of human so¬
ciety.
Some philosophers of science, to be sure, have likened the controlled
experiments in the physical, chemical, or biological laboratories to “inter¬
rogations” and “cross-examinations.” But, notwithstanding the cleverness
of such metaphors, the observation of physical (chemical, biological)
changes in response to controlled variations in conditions is essentially
different from verbal replies to verbal questions. To watch the change in
the speed with which molecules move as temperature is increased is not
the same thing as to ask them why they are moving faster, and then to
listen to the introspective explanations they might offer in reply—if they
were able to talk.
Whether the fact that the natural scientist does not have to bother with
verbal communications from observed objects was ever emphasized, or even
mentioned, by early writers on the philosophy of science—this I must leave
to the historian of ideas. I do know, however, where I encountered the idea.
It came to me through Alfred Schütz,4 who in turn gave credit to Hans
Kelsen.5
In his theory of law, Kelsen discussed the problem of contradiction be¬
tween self-interpretation and the analyst’s interpretation of the written con¬
stitution of a state. What should we make of the contentions, stated in
such a document, that the particular state was a federation, a democracy,
a republic, if we find these contentions contradicted by our “objective”
interpretation of many of its substantive provisions? Should we disbelieve
and discard the self-characterization? The same problem appears frequently
in connection with statutory law. Several statutes in the United States, for
example, tell in their preambles that they are enacted to preserve com¬
petition and reduce monopoly, while their actual effect—intended or un¬
witting—is to reduce competition and increase monopoly.
It was this type of contradiction that prompted Kelsen to make a general
observation about the “considerable difference between the subjects of cogni¬
tion in juridical science, and indeed in all social sciences, and the subjects
of cognition in the natural sciences. A rock does not say: I am an animal.”6

THE ISSUE CLEARLY POSED

The implication is clear: If a rock said of itself that it was an animal,
the geologist could not be content with a statement on its chemical com¬
position, physical form and structure, and geological origin; he would also
have to explain why the rock was telling something that contradicted the
geologist’s finding. He would have to explain why the rock was wrong, did
not know what it was talking about, or was trying to confuse those who
listened to it.
It is one of the characteristics of the natural sciences that their subjects
of investigation do not talk about themselves. Moreover, the

facts and events [studied by natural scientists] are neither preselected nor
preinterpreted; they do not reveal intrinsic relevance structures. . . . The facts, data,
and events with which the natural scientist has to deal are just facts, data, and
events within his observational field, but this field does not mean anything
to the molecules, atoms, and electrons therein.
But the facts, events, and data before the social scientist are of an entirely
different structure. His observational field, the social world, . . . has a particular
meaning and relevance structure for the human beings living, thinking and
acting therein. They have preselected and preinterpreted this world by a series
of common-sense constructs of the reality of daily life, and it is these thought
objects which determine their behavior, define the goal of their action, the
means available for attaining them. . . . The thought objects constructed by
the social scientists refer to and are founded upon the thought objects constructed
by the common-sense thought of man living his everyday life among
his fellow men. Thus, the constructs used by the social scientist are, so to speak,
constructs of the second degree, namely constructs of the constructs made by
the actors on the social scene whose behavior the [social] scientist observes and
tries to explain in accordance with the procedural rules of his science.7

NO DIFFERENCE IN LOGIC

The inherent “meaning structure” of human action prompts Schütz, as it
did Max Weber, to proclaim the postulate of subjective interpretation.8
This postulate requires the social scientist to ask what model of an individual
mind can be constructed and what typical content must be attributed to it
in order to explain the observed facts as the results of the activity of such a
mind in an understandable relation.9 This does not mean that only one
model of an individual mind would fit the observed facts. Several different
models of various degrees of specificity or generality may be adequate for
the explanation of the same set of observations, so that the social scientist
has the same problem that the natural scientist has of choosing among alternative
hypotheses. Thus, Schütz’s “postulate” leaves more freedom to
the social scientist than the term may suggest.
Ernest Nagel, however, remains skeptical concerning this postulate. He
concedes that many social scientists seek “to explain such [i.e., social]
phenomena by imputing various ‘subjective’ states to human agents participating
in social processes”; but he questions “whether such imputations
involve the use of logical canons which are different from those employed
in connection with the imputation of ‘objective’ traits to things in other areas
of inquiry.”10
Precisely what is meant here by “different logical canons”? If Nagel
means no more than that the effort “to ‘understand’ social phenomena in
terms of ‘meaningful’ categories”11 “does not annul the need for objective
evidence, assessed in accordance with logical principles that are common
to all controlled inquiries,”12 he is not in any disagreement with either Weber
or Schütz. Schütz, too, calls for “methodological devices for attaining objective
and verifiable knowledge of a subjective meaning structure”13 and
insists “that the principal differences between the social and the natural
sciences do not have to be looked for in a different logic governing each
branch of knowledge.”14

THE NATURE OF THE DIFFERENCE

Nagel believes that the differences which the Weber school stresses between
the explanation of social phenomena and that of natural phenomena
lie chiefly in the “personal experience,” “sympathetic imagination,” and
“empathic identification” that are possible for the social scientist and may
aid him in his efforts “to invent suitable hypotheses.”15 But Nagel denies
that these differences are essential as far as the validity of explanatory
hypotheses is concerned. He explicates his position by the following
illustration:

... we can know that a man fleeing from a pursuing crowd that is animated
by hatred toward him is in a state of fear, without our having experienced such
violent fears and hatred or without imaginatively recreating such emotions in
ourselves—just as we can know that the temperature of a piece of wire is
rising because the velocities of its constituent molecules are increasing, without
having to imagine what it is like to be a rapidly moving molecule. In both
instances “internal states” that are not directly observable are imputed to the
objects mentioned in explanation of their behaviors. Accordingly, if we can
rightly claim to know that the individuals do possess the states imputed to them
and that possession of such states tends to produce the specified forms of
behavior, we can do so only on the basis of evidence obtained by observations
of “objective” occurrences—in one case, by observation of overt human
behavior (including men’s verbal responses), in the other case, by observation of
purely physical changes. To be sure, there are important differences between
the specific characters of the states imputed in the two cases; in the case of the
human actors the states are psychological or “subjective,” and the social
scientist making the imputation may indeed have first-hand personal experience
of them, but in the case of the wire and other inanimate objects they are not.16

I should like to raise some questions about four points in Nagel’s
formulation:
(1) Our knowledge of the state of “fear” of the fleeing man and of the
“hatred” animating his pursuers does, of course, not presuppose that we can
“identify” with the people observed. It does, however, presuppose that we
know what fear and hatred “really” are. We could not know what fear is if
we had never felt it or at least, as some would say, if we had not been told
about it by persons who had.17 The same is true for hatred. The meaning of
these words could never be grasped except on the basis of direct personal
experience or perhaps (but perhaps not) on the basis of verbal communications
from some who have had such experience.18
(2) When Nagel extends the concepts of “observation of ‘objective’
occurrences” and of “overt human behavior” to include “men’s verbal
responses,” he loses the clue to the problem. That fearing and hating men
can tell us about their fears and hates, whereas molecules cannot tell us
about their slower or faster movements, is a difference, not only “important,”
as Nagel concedes, but essential enough to justify the postulate of
subjective interpretation à la Weber and Schütz. For the men talking to us
may deny any fears and hates that we impute to them, or claim that they
are animated by feelings which we fail to impute to them in our objective
interpretation of their actions. Molecules, on the other hand, never contradict
our hypotheses by verbal depositions—except in our parable or
similar pieces of fiction.19
(3) Nagel’s illustration contrasts a concrete observation of a particular
human situation, that is, a single instance of a (poorly bounded) class of
social phenomena, with a well-bound class of physical phenomena reproduced
thousands of times in thousands of laboratories. We shall come
back to this lack of parallelism when we discuss the difference between
the constructs used in universal laws and those used in reports on particular
events.
(4) The “important differences” that Nagel recognizes are those between
physical and psychological states, with the possibility of “firsthand
personal experience” of psychological states on the part of the social
scientist. This emphasis is at the expense of even more important differences,
especially that the subjects of inquiry in the social sciences can give us
their opinions about our explanations of social phenomena; that their
opinions may sometimes be helpful, sometimes misleading; that they may be
contradictory, some saying one thing, some another; and that large portions
of several social sciences have as their subject matter verbally stated theories
of social “actors,” or, at least, their interpretations of the actions and
intentions of their “partners.”
It is in reaction to this fourth point, to Nagel’s emphasis on “firsthand
personal experience” (and “empathic identification”), that Schütz exclaims
that subjective understanding or Verstehen “has nothing to do with
introspection.”20 What Schütz wants to say here, I suppose, is that “subjective
understanding” goes far beyond introspection and does not always require
it. It merely requires the construction of at least one model of the actor
or of the type of actor, that is, an imaginative construction of perceptions,
memories, and preferences that is adequate for explaining (and for predicting)
the observed behavior or the observed consequences of presumed
behavior.

TALES TOLD BY MOLECULES

Let us go back to the end of our parable, where the molecules, after
denying the story told by the physicist, offered to tell all they knew about
themselves. The lesson of the parable was not that the physicist had never
been a molecule and thus had no introspective knowledge about molecules
but that the tales told by the molecules would become data and problems
for the physicist to deal with. The self-interpretations of the molecules and
their interpretations of the actions and reactions of their fellow molecules
would become integral parts of the scientists’ observational field.
Whether the tales told by inanimate matter would help or hinder the
scientists’ work is difficult to say. New discoveries will sometimes complicate,
mess up, or even destroy the nicest and most widely accepted scientific
models of natural phenomena, and thus increase the “mystery” of nature
for the time being. Yet, in the long run such discoveries may prove to have
been significant steps in the search for “truth.” On the other hand, the
newly discovered facts may turn out to be errors of observation, and the
scientists’ efforts to accommodate them in their theoretical system may have
been sheer waste. In the same sense, any verbal reports mysteriously made
by inanimate matter—on the witness stand, on the psychoanalyst’s couch,
on questionnaires, or in informal interviews—would certainly mess up the
scientists’ systems of ordered knowledge; in the long run, the value of such
reports may prove to be positive or negative. Undoubtedly, most scientists
would prefer not to be bothered by any confessions, true or false, of their
now conveniently silent subjects of observation.
The most irritating disturbances would come from contradictory communications.
They would raise, among other problems, the question of who,
if anyone, is right, or “more credible.” Assume, for example, that some
molecules explained their movements as part of a well-designed plan of
action, others as emotional reactions to irritations from their fellow
molecules, while a few molecules admitted that they had been pushed
around in random collisions with others. The scientist would probably regard
the few respondents who had the “correct” story as particularly honest and
intelligent molecules. But he would still be confronted with the problem of
explaining why the others were liars or, at least, confused and unreliable
witnesses.21

TALES TOLD BY MEN

In all social sciences, theorists, empirical researchers, and practitioners
are greatly hampered by (deliberately or unwittingly) false reports from
men telling about their own actions. However, the question for the social
scientist is not whether the reports received from human actors are helpful
or unhelpful; in many instances such verbal reports are the only data at
their disposal and may be the very subject matter of their investigations.
(This is the case, for example, in economic inquiries about prices, which
are reports from buyers or sellers.)
Even where the communications (from those who take actions which, or
the consequences of which, the social scientist studies) are not the sole
data for inquiry, but where the communications are data supplementary to
a record about physically observable phenomena, even then the social
scientist must not disregard them. He must account for them, whether they
are a help or a nuisance.
Strangely enough, the discussion of these problems has often taken it for
granted that the social scientists, especially those accepting the Weber
position on subjective understanding, regard introspective or communicated
insights as always helpful in their work. Thus, Nagel, questioning the
superiority of “interpretative explanations” in the social sciences, asks,
“Do we really understand more fully and with greater warranted certainty
why an insult tends to produce anger than why a rainbow is produced when
the sun’s rays strike raindrops at a certain angle?”22
The answer probably depends on who “we” are. If we are physicists, the
answer is “no”; if we are persons untrained in physics, the answer is “yes.”
But if we are interested in the philosophy of science, the comparison is
moot. The point is that the very notions of “insult” and “anger” have no
meaning outside the consciousness of those who have been insulted and
angered or who have been told by some who have. At the same time, those
who tell us about insults suffered and anger felt may be trying to mislead
us—and perhaps themselves.
The problem of misleading tales from men engaged in all sorts of
activities is well known to social scientists. Economists have often complained
about the misinformation received from persons who do the very
things which economic theory tries to explain but who contest the theorists’
explanations. We may recall the perpetual disagreements between practitioners
and theorists of banking. David Ricardo, 150 years ago, spoke
about the directors of the Bank of England who did not understand what
they were doing or what they were talking about; and about dealers in
foreign exchange who reported rates which could not possibly be correct.
Generations of economists have written about generations of commercial
bankers who failed to grasp the implications of their actions and often
misinterpreted their own intentions.23 Writers on the theory of the business
firm have repeatedly been criticized by businessmen who disliked the fundamental
hypotheses of the theorists and offered contradictory explanations
of business conduct.24

SILENT NATURE VERSUS TALKING MAN: ONLY ONE OF THE DIFFERENCES

How fortunate, in contrast, are the physicists, say, those in particle
theory: They do not have to put up with denials or contradictions of their
propositions by verbal communications from electrons and positrons. Imagine
how a physicist would react to positrons protesting that they have
unjustly been called “antiparticles,” or to photons denying that they were
“carriers” of the electromagnetic field.
Think of the long faces of biologists if the Tass headline, featured in my
parable, became true and genes really passed a resolution siding with
Lysenko! Or if cells divided in an opinion poll about the differences between
viruses and microbes. And how disturbing to microbiologists it would
be if a society of cells endorsed the selection of a scientist for the Nobel
Prize and cited with approval his use of an anthropomorphic analogy:
“. . . a cell consists of molecules which must work in harmony. Each
molecule must know what the others are doing.”25 Some microbiologists
might then take heart when they learned that a minority of the cells had
dissented, protesting against anthropomorphism as inappropriate in the explanation
of their interactions.
To be sure, these events—the messages received from particles, genes,
cells, etc.—need not at all change any predicted outcomes of actual movements
observed by the scientist. The trouble caused by the messages might
consist only in the extension of the scientist’s task: He would have to explain
the processes behind the misleading messages. On the other hand, some
of the messages might give clues useful in the modification of existing
theories.
Perhaps I am giving too much play to the contrast between silent nature
and talking man. Claims for recognition of several other issues in the discussion
of differences between natural and social sciences have been made.
Without deciding the relevance and relative importance of the various
issues, and fully recognizing that some of them are closely related and partly
overlapping, I propose to offer a list designed to point up some notable
distinctions. The list will include the question of introspection, although
Schütz preferred to have it put aside. All the issues refer to the relationship
of the investigator to his subject matter, that is, in the social sciences, to
man, human action, or the effects of human action.

The investigator in the social sciences

(1) can feel and think like the men whose actions he investigates;
(2) can talk with other men, learn about their experiences, thoughts, or
feelings, and ascertain that these are similar to his own;
(3) can listen to verbal communications, or read written communications,
among persons whose actions he investigates, or among persons
of the same type;
(4) can receive verbal communications, solicited or unsolicited, directly
from the persons, or type of persons, whose actions he investigates;
(5) can make mental constructs and models of human thinking and acting,
and can construct theoretical systems involving relationships
among ideal-typical actions, counteractions, and interactions;
(6) can interpret, with the use of his abstract models and theories,
particular (concrete) observations of human conduct;
(7) can interpret, with the use of his abstract models and theories,
particular (concrete) data as results of certain types of action;
(8) cannot build useful constructs and theories in disregard26 of constructs
and theories formed and communicated by men of the type
he observes;
(9) cannot obtain useful data (i.e., the “givens” he is supposed to explain)
except through verbal (and often also numerical) reports
from men engaged in the activities he investigates.

Following Schütz, I regard point 8 as the most significant. But it is
obviously connected with several other points, especially with point 4. Since
point 4 is most easily comprehended, even by laymen and scientists with
an aphilosophical or antiphilosophical orientation, I have chosen this point
as the one to emphasize and dramatize.

OBSERVATION AND EXPLANATION IN ECONOMICS

My emphasis on the importance, for the invention and acceptance of
theoretical models in the social sciences, of communicated interpretation
of human actions by the actors themselves may give a false impression.
For, alas, these “prescientific” or naive interpretations may be very poor
clues to a satisfactory theory of the network of actions, reactions, and
interactions which the social scientist has to explain. This warning, however,
should not support the opposite position, namely, that complete
absence of verbal communications from the participants in social actions
would facilitate the construction of a good theory. Indeed, certain institutions
and processes could never be satisfactorily explained by observers of
overt behavior exclusive of men’s verbal responses.
Assume an anthropologist arrives from a populated planet (I do not
know whether Mars still qualifies for this designation)—a scholar with a
great gift for observation but without any knowledge of human institutions,
practices, or languages. He sets himself the task of explaining the working
and the function of the stock market. He might observe the traders, jobbers,
messengers, brokers, and customers, their movements, their gestures, and
their shouts for any length of time, but he would not even come close to a
superficial description of the actual process, not to speak of the function
of the institution.
Now endow him with the ability to speak and to understand the language,
and permit him to interview every one of the people engaged in the activities
of the stock market. He would end up with information, but he
would not understand enough of what goes on to know the economic functions
of the stock market, particularly its role in the utilization of investible
funds and in the formation of capital. Since probably 999 out of 1000 persons
working on the stock market do not really know what it does and how
it does it, the most diligent observer-plus-interviewer would remain largely
ignorant. Alas, economics cannot be learned either by watching or by interviewing
the people engaged in economic activities. It takes a good deal of
theorizing before one can grasp the complex interrelations in an economic
system. And this theorizing consists mainly in constructing ideal types of
motivated conduct of idealized decision-makers and combining them in
abstract models of interactions.
From time to time attempts have been made, in economic literature, to
do without the fundamental hypothesis of economic theory, that is, without
the assumption that households and firms pursue a definite objective, such
as maximization of satisfaction and profits. For example, it has been proposed
“to start with complete uncertainty and nonmotivation” and rely on
“the principles of biological evolution and natural selection” to explain and
predict the course of economic events.27 The principle of conscious “adaptation”
by firms seeking more profits was to be replaced by a principle
of “adoption” of successful firms by the environment. The survival of the
“viable” firms and the elimination of the nonviable ones were supposed
to be the result of “competition.”
This proposal depends on the assumption of competition; but competition
in markets depends on the desire of human decision-makers to make
profits. Competition among hungry animals for scarce food can be understood
without reference to any “thoughts” expressed by the animals. Competition
among well-nourished men cannot. Of course, competition among
athletes in a sport contest, competition among scholars in intellectual
endeavors, and competition among businessmen in trade and industry are
different matters, each presupposing different motivations. The point is that
the existence of the profit motive must be presupposed to explain competition
in business. If firms in particular lines of activity make good profits,
the emergence of newcomers trying to get a share in the market can be
expected only if one assumes that there are men who prefer more money
to less and, therefore, decide to enter the industry that seems to offer
relatively large profits.28
One of the most important phenomena of the social world, inaction or
“negative action” (“intentional refraining from action”), necessarily escapes
sensory observations,29 other than the nonactor’s verbal statement of his
“reasons,” that is to say, a statement of his (perhaps wrong or misleading
and certainly introspective) theory about his way of thinking. Where
inaction is a mass phenomenon, the construction of an ideal type of man
who would “understandably” not react to a particular change in conditions
is required.

UNIVERSAL AND PARTICULAR, THEORY AND HISTORY

One of the worst stumbling blocks in the methodological analysis of the
social sciences was the insistence of many (chiefly German) philosophers of
science on a categorical difference between natural and cultural sciences.
The cultural sciences, they argued, were not “generalizing,” like the natural
sciences, but were, instead, “individualizing” in the sense that their only
concern and interest were individual events at particular times and places.30
For these writers, the social sciences were essentially “history.” Confronted
with the general theoretical system of economics, a foremost representative
of this school of thought stuck to his principles and without hesitation
separated economics from the other social sciences by designating it as a
natural science.31 The cultural sciences were “by definition” concerned only
with historical events.
However widespread this notion was at one time, nowadays it is at best a
chapter in the history of ideas. Philosophers of science, irrespective of their
differences on many issues, are now fully agreed that almost all disciplines
have a core of general propositions, with applicability to concrete situations
or particular cases. This is true of the natural and the social sciences alike.
Of course, application does not mean that the propositions of the discipline
will be sufficient to explain a concrete situation, change, or event (or to
predict actual outcomes or to prescribe for desired outcomes). As a rule,
propositions of several disciplines will have to be brought to bear on explanations
(predictions, prescriptions) in particular cases. No discipline is
self-sufficient when it comes to applications. Incidentally, there is much
division of labor among those professing a discipline, some of them specializing
in formulating, reformulating, and disseminating general propositions
—theorists; others in applying them to particular cases—applied scientists
and engineers (including social engineers).
Perhaps a few words should be said about one discipline which is exclusively
concerned with applications of general propositions from other
disciplines to particular situations and events: I refer to history. The
historian is an applied sociologist, political scientist, psychologist, social
psychologist, economist, anthropologist, archaeologist, military scientist,
philologist, linguist, physiologist, biologist, chemist, geologist, physicist,
statistician, and what not. Since he deals chiefly with human history, he is
predominantly an applied social scientist and will, where propositions of
natural sciences are relevant to historical research, either rely on generally
known propositions (for example, that certain chemical substances are
deadly poisons) or turn to specialists for advice. The historians who explain
Caesar’s decision to cross the Rubicon and the historians who explain
Roosevelt’s decision to devalue the dollar apply different mixtures of social
sciences, although psychology is a strong ingredient in both.
I have said that almost all disciplines—though not history—have a core
of general propositions with (usually indirect) applicability to concrete
situations or particular cases, and that this is true for natural and social
sciences alike. Yet, strangely enough, when we search modern treatises
on the philosophy of science for illustrations in all sorts of contexts, we
find a consistent inconsistency: The natural sciences are, practically without
exception, illustrated by general laws or by propositions about empirical
regularities, whereas the social sciences are illustrated by particular instances,
singular observations, and historical events. Whatever may have
been responsible for this discrimination in analysis and exposition, it cannot
help being misleading. Indeed, it has, I believe, led the philosophers
themselves into erroneous positions concerning the very issues we have
been treating in the present essay.
To show what I have in mind I shall present and briefly examine three
propositions, all in the form of questions about price increases:

(1) Why did the United States Steel Corporation raise the prices of
certain steel products in April 1962 by 3½ percent?
(2) Why did prices, as measured by the cost-of-living index, rise in the
United States by 7 percent from 1956 to 1958?
(3) Why will prices increase if, with a given labor force, given facilities
of production, and given technological knowledge, total bank credit
is expanded and aggregate spending by government and business
increases?

Only the third question is a problem of economic theory. The first is
chiefly a problem of business history. To answer it, many things besides
economic theory have to be known; indeed, economics may be relatively
irrelevant in explaining why corporate management took the particular
decision. Psychology, sociology, politics, management science, industrial
relations, accounting, and several other disciplines may be involved; a professional
economist may, of course, know enough of all these fields to
answer the question without calling in a team of experts from ten other
departments. (The reader may want to be reminded that the particular
price decision precipitated a row between the President of the United
States and the President of the United States Steel Corporation.)
The second question is one of historical statistics. Since it involves mass
conduct, that is, decisions and actions of millions of anonymous people
selling and buying thousands of different things, we expect that propositions
of economic theory are of paramount but not of exclusive relevance. The
full explanation calls for knowledge in a variety of fields: political science,
law and diplomacy, military science, logistics, technology, engineering,
trade-union politics, and other arts and sciences. (The reader may have to
be told that military actions in Egypt, the closing of the Suez Canal, the
rerouting of oil shipments, and several other things played significant
roles, besides fiscal, monetary, and labor policies.)
The third question is pure economics, and nothing but economics, because
it does not refer to any particular event in time and space. It is
answered by reference to general propositions in the form of “universal
laws” or fundamental hypotheses. These hypotheses involve constructs of
idealized human action based on (assumed) objectives to maximize profits
and satisfactions. The hypothetical price increases are explained as the
results of certain types of hypothetical actions which, in turn, are understood
in terms of “meanings” on the part of hypothetical human actors or
homunculi made to order to suit the economist’s purposes.32

NAGEL ON PROPOSITIONS OF SOCIAL SCIENCES

I am not sure whether Nagel sees the concepts and theories of the social
sciences in this or in a very different light. For he does not choose for
his illustrations general propositions of social sciences, but rather singular
events involving particular persons at a specified time and place. He states
this most clearly when he discusses MacIver’s example of the man fleeing
from a pursuing crowd and finds that it involves “an assumption, singular in
form, characterizing specified individuals as being in certain psychological
states at indicated times. . . .”33
At one point Nagel discusses a point of economic history: Southern
cotton planters were “unacquainted with the laws of modern soil chemistry,
and mistakenly believed that the use of animal manure would preserve
indefinitely the fertility of the cotton plantation.” He holds that the “social
scientist’s familiarity with those laws” will help him explain the gradual
deterioration of the soil and the consequent need for virgin land to maintain
the output of cotton.34 I submit that it is not the “social scientist” who
needs this knowledge of soil chemistry; it is the historian who, in explaining
the events and changes he has selected for investigation, has to know all
sorts of things, including some general laws of physics, chemistry, agronomy,
and so forth. If the historian happens to have competence (or a university
degree) in economics or any other social science, this does not make
physics, chemistry, or agronomy a part of social science. The exhaustion
of the soil used in cotton production may be a result of human action
(deficient fertilization), partly explained with the aid of economic theory,
and in turn also a cause of human action (cultivation of additional land),
again in part explained in terms of economics. However, this does not make
the exhaustion of the soil the province of economics. Technology is not a
social science, even if it plays a great role in many classes of phenomena
with which social scientists have to deal. My main point is that concrete
events in history, particular cases in the real world, are rarely, if ever,
explained with the aid of a single discipline but require application of
several fields of knowledge.
In his critical discussion of “meaningful” or “interpretative” explanation
in the social sciences, Nagel tries to show that the imputation of motives
or sentiments to human agents is quite unreliable.

We may identify ourselves in imagination with a trader in wheat, and conjecture
what course of conduct we would adopt were we confronted with some
problem requiring decisive action in a fluctuating market for that commodity.
But conjecture is not fact. The sentiments or envisioned plans we may impute
to the trader either may not coincide with those he actually possesses, or even
if they should so coincide may eventuate in conduct on his part quite different
from the course of action we had imagined would be the “reasonable” one to
adopt under the assumed circumstances.35

We may note that in this illustration Nagel again refers to our imagined
identification with a particular trader in wheat, even asks about “the sentiments
and envisioned plans” which he actually possesses, and raises questions
about his actual conduct. Since I may assume that Nagel is not alluding
to the psychoanalysis of a wheat dealer of his acquaintance, but rather
to the methodology of economic analysis, I take the liberty of offering an
interpretation of the “actual” role which “interpretative” explanation has
in economics, and I propose to do this with an illustration involving traders
in wheat.
The economist is concerned with questions of the following kind: How
will the price of wheat be affected by a report of a drought; by a reduction
in the import quota for wheat; by a reduction in the rate of interest; by
an increase in freight rates; by an announcement that the ice cover on the
Great Lakes will delay the opening of shipping for several weeks? These
questions can be answered with the aid of general propositions of economic
theory. The answers do not presuppose that the economist knows any wheat
dealer personally, let alone his psychological make-up. They do presuppose,
however, that the economist has constructed an ideal type of dealer conduct.
Its main feature is that dealers would rather make more money than
less. This imputation of the profit motive to anonymous characters—“intervening
variables” between, say, a newspaper report and a quotation of a
higher price on the wheat exchange—is necessary for a full understanding
of the causal connections.

HUNCHES

I cannot pretend to know why Nagel, like most other philosophers of
science, confines himself, in illustrations from social sciences, to propositions
about concrete events and particular persons. I have a hunch, however,
that the explanation is related to the main issue of this essay: that human
actors can talk about themselves, their actions, and the events they experience.
If they could not, Nagel would not be able to question his wheat
dealer in order to ascertain whether he “actually possesses” the sentiments
and plans that an economist may have imputed to the dealer or, more
likely, to a hypothetical dealer of a heuristic model.
Perhaps, if molecules could talk, and told about their individual sentiments
and plans, the philosopher of science would be tempted to switch
his attention from general to particular propositions about molecular
motions. (Physicists, though, might soon learn to discount the tales told
by molecules.) If genes could talk, philosophers of science would perhaps
emphasize the divergences between the geneticists’ readings of the hereditary
code and the genes’ own translations into English (even if biologists decided
to disregard the confessions of the genes).
One may venture the thought that the development of the computer
has opened an area in which the contrast between silent matter and talking
man may vanish and the procedures of natural and social scientists converge.
Assume for a moment that scientists can observe both the input
and the output of a modern computer but have no access either to the
information storage or to the program tape. Could they explain the behavior
of the computer? This is similar to the task of explaining human
behavior without knowing either the memory (information storage) or the
skills and preferences (program tape) of the actors.36
For a primitive explanation of the computer’s behavior, purely empirical
methods (linking frequencies of various kinds of input and output) might
suffice, though one might not have much confidence in the findings. For
a more thorough and more powerful explanation, we would want to construct
models of the (unknown) memory stored in the computer and of
the (unknown) program tape directing its actions. If some philistines
should now rebel against my assumptions and insist that we not reconstruct
(imagine) but inspect (observe) the memory and the program in
the computer, they would merely reestablish the contrast between natural
and social sciences: After all, there is no way for the social scientist to
“inspect” human memories and programs. He can introspect, he can receive
and interpret verbal communications about introspections by others, and,
most importantly, he can construct models of individual minds deemed
adequate for the explanation and prediction of human “output.”

NOTES
1. Carl Ewald, Mutter Natur Erzählt (Stuttgart: Franckh, 1910); idem, The
Spider and Other Stories (New York: Scribners, 1907); idem, The Old Post and
Other Nature Stories (London: Dent, 1922).
2. Christian Morgenstern, “Die Westküsten,” in Galgenlieder (Berlin: Bruno
Cassirer, 1926), pp. 42-43.
3. Friedrich von Wieser, “Das Wesen und der Hauptinhalt der theoretischen
Nationalökonomie: Kritische Glossen,” Jahrbuch für Gesetzgebung, Verwaltung und
Volkswirtschaft im Deutschen Reich, 35. Jahrgang (1911), p. 402.
4. Alfred Schütz, Der sinnhafte Aufbau der sozialen Welt (Vienna: Springer,
1st ed. 1932, 2nd ed. 1960), pp. 281-282.
5. Hans Kelsen, Allgemeine Staatslehre (Berlin: Springer, 1925), p. 129.
6. Kelsen, loc. cit. Fascinated by this story of the rock, I made the rock expand
its tale: “I came here because I did not like it up there near the glaciers, where I
used to live; here I like it fine, especially this nice view of the valley.” Fritz Machlup,
“Are the Social Sciences Really Inferior?” Southern Economic Journal, Vol. XXVII
(January 1961), pp. 176-177; reprinted in Maurice Natanson, ed., Philosophy of
the Social Sciences (New York: Random House, 1963), p. 166.
7. Alfred Schutz, “Common-sense and Scientific Interpretation of Human Action,”
Philosophy and Phenomenological Research, Vol. XIV (September 1953),
p. 3; reprinted in Alfred Schutz, Collected Papers, Vol. I (The Hague: Martinus
Nijhoff, 1962), pp. 5-6.
8. The idea of subjective interpretation—Verstehen—was first advanced by
Wilhelm Dilthey. He, however, confined it to interpretations of history and literature.
Wilhelm Windelband and Heinrich Rickert extended the postulate to the social
sciences, or rather “cultural” sciences, which they, however, regarded as strictly
historical in character. (For citations, see footnote 30, below.) It is Max Weber to
whom we owe the further extension of the principle to generalizing (and predictive)
social sciences. Whether for Weber subjective interpretation was a requirement or
merely an important aid in the analysis of social phenomena is still controversial.
9. Schutz, Collected Papers, Vol. I, p. 43.
10. Ernest Nagel, The Structure of Science (New York: Harcourt Brace, 1961),
p. 481.
11. Ibid.
12. Ibid., p. 485.
13. Schutz, op. cit., p. 36.
14. Alfred Schutz, “Concept and Theory Formation in the Social Sciences,” in
Collected Papers, Vol. I, p. 65.
15. Nagel, op. cit., p. 484.
16. Ibid.
17. I cannot resist recalling the operatic dialogue between young Siegfried, in
Richard Wagner’s music drama, and old Mime: Siegfried asking what fear is and
how one could learn how to fear, and Mime first trying to teach him fear by describing
his own feelings of anxiety and then, when this proves unsuccessful, promising
that Siegfried would soon learn it by personal experience when he encounters
Fafner, the dragon.
18. We may know what fear and hatred felt like when we felt them and how we
think we acted at those times; we may also know how other people acted when
they reported fearing and hating and how they described their feelings. We then try
to find a correspondence or similarity—an overlap among the relevant features
common to these sets of private and public observations.
19. I realize that we can build instruments which tell us by means of signals in
English about the physical state of matter. For example, the gauge in my automobile
tells me whether the water in the radiator is “cold” or “hot.” If the gauge is out of
order, the “report” may be wrong. Yet, we would never say that the water was
“lying” about its temperature. It is not the water that tells us about its feeling cold
or hot; the gauge gives us signals by means of a mechanism which man has invented,
built, and installed.
In this example, as Karl Deutsch called to my attention, there is a gap between
the report by the gauge and the response by the driver. Gaps of this sort can sometimes
be bridged. In the human body, signals are often coupled to a response without
the intervention of consciousness, as for example by various feedback mechanisms,
studied by neurophysiologists. Analogous mechanisms are designed by man: sen-
steering apparatuses. The difference between automatic and conscious responses in
the case of human behavior is, I believe, relevant to the scientific procedures in
different behavioral sciences. The “strictly behavioral” scientist studies unconscious
reactions of the body; in contradistinction, the “social” scientist studies conscious
reactions of man to signals received from his environment, including actions of
other persons.
20. Schutz, op. cit., p. 56.
21. The first step of a scientist confronted with contradictory and dubious confessions
(by hitherto silent matter) would be to ascertain how relevant the different
motivations reported are for the actual movements observed. He may find that
several different confessions would account for the same movements (under the
same conditions). In this case he might have no prima facie reason for preferring
one “subjective explanation” to another. The differences could become more significant
as his range of experimental findings expands and yields critical data allowing
or requiring him to exclude one or more of the previously eligible explanations.
In any case, however, he would have to search for explanations of the contradictory
“subjective” explanations. The record of the contradictory reports presents problems
which call for investigation.
22. Nagel, op. cit., p. 483.
23. That commercial banks “create” credit and money is now known to practically
all sophomores studying elementary economics and is fully recognized by the
official authorities reporting statistics on the supply of money. Yet, the majority of
commercial bankers have stubbornly denied it in interviews, public speeches, and
print. The economist can explain the failure of the bankers to form a correct image
of their actions and of the consequences of their actions: The banker receives deposits
from customers, which adds to his reserves; he grants loans to customers,
who then draw on the bank, which will reduce his reserves; thus he cannot lend
more than he has received. What the banker does not realize and cannot observe is
that many of the deposits he receives are from persons who had received payments
from those who had obtained loans from other banks and even from himself. Thus,
the banker does not know what he really does or brings about because he cannot
observe it. He may, of course, learn it from economists. But his uninstructed opinion
—and frequently also his opinion unshaken by attempted instruction—contradicts the
economists’ theories.
24. This refers to the assumption that the firm attempts to maximize profits. At
the bottom of the controversy, in which so-called business economists and professors
of management science often take the side of the businessman contradicting the
economic theorist, lies the confusion between the “firm” as an organization—a group
of persons with a variety of objectives, somehow coordinated—and the “firm” as
a pure construct in the analytical role of an intervening variable in the theory of
prices, inputs, and outputs.
25. André Lwoff, “Interaction among Virus, Cell, and Organism” (Lecture delivered
in Stockholm upon receiving the Nobel Prize in Physiology), Science, Vol.
152 (27 May 1966), p. 1216.
26. While he may not completely disregard constructs and theories communicated
by the subjects, he may contradict them for adequate reasons.
27. Armen A. Alchian, “Uncertainty, Evolution, and Economic Theory,” Journal
of Political Economy, Vol. LVII (June 1950), pp. 221 and 211.
28. Edith Penrose, “Biological Analogies in the Theory of the Firm,” American
Economic Review, Vol. XLII (Dec. 1952), pp. 804-819 (esp. pp. 809-816); idem,
“Rejoinder,” Vol. XLIII (Sept. 1953), pp. 603-609.
29. Schutz, Collected Papers, Vol. I, p. 54.
30. Among the major representatives of the categorical differentiation between
generalizing and individualizing sciences were Wilhelm Dilthey, Einleitung in die
Geisteswissenschaften (Leipzig: Duncker & Humblot, 1883); Wilhelm Windelband,
Präludien (Tübingen and Leipzig: Mohr, 1903); and Heinrich Rickert, Die Grenzen
der naturwissenschaftlichen Begriffsbildung (Tübingen: Mohr, 1902, 2nd ed. 1913).
See footnote 8, above.
31. Rickert, op. cit., p. 224.
32. Most economists are satisfied that some people—in sufficient number to be
significant—really act in ways similar to the programmed decision-making by the
homunculi. But there are also economists who do not care about even that much
correspondence between real and imagined men, as long as the conclusions that can
be derived from conjunctions between the constructed types and certain sets of
specified conditions broadly correspond to the observed records of events that have
actually occurred after conditions of the specified sort have actually existed.
33. Nagel, op. cit., p. 482. The emphasis is mine.
34. Ibid., p. 476.
35. Ibid., p. 483.
36. I am indebted to Karl Deutsch for a stimulating discussion of these points.
FUNCTIONALISM IN SOCIAL
ANTHROPOLOGY
Maurice Mandelbaum

Few philosophers of science in our generation have been as scrupulous
as Ernest Nagel in taking into account the actual methods and the actual
results of the sciences, and probably none has ranged so widely over the
whole territory of contemporary scientific thought. These characteristics
of his work, as well as its clarity, have given him unique distinction and
have placed many scientists and almost all contemporary American philosophers
deeply in his debt. To be asked to join in honoring him is itself
an honor. Nonetheless, in what follows it will be my aim to suggest that
in one case the model which he has given us has been misleading, and unlike
his other analyses has departed too widely from the methods and goals of
those whose works it sought to explicate. To use the present occasion to
bring forward this suggestion is not, however, perverse: One cannot have
had contact with Ernest Nagel, nor with his work, without appreciating the
extent to which, under all circumstances, he has sought to clarify issues
and to do justice to theories, positions, and problems, following where
the facts lead.
What I wish to propose is that the actual historical movement in social
anthropology which has been called “functionalism” should be treated
independently of questions concerning functional explanations in the
biological sciences, and independently of issues concerning teleological
explanations. This proposal involves a departure from Nagel’s position. In
“A Formalization of Functionalism,” and later in The Structure of Science,
his analysis of functionalism leaned very heavily upon the analysis he had
given of functional explanations in biology; and that analysis he first proposed
in an article entitled “Teleological Explanation and Teleological
Systems.”1 In this respect Nagel’s work has been typical of most of the
work done by philosophers who have been concerned with functionalism
in the social sciences.2
It is of course true that those who sought to establish the functionalist
position in social anthropology frequently called attention to parallels with
biology, and similar statements are to be found among those who are generally
identified with a functional position in sociology.3 More specifically,
it must be admitted that some of the biological parallels which have been
drawn by anthropologists and sociologists do suggest a connection between
functionalism in the social sciences and the acceptance of an organismic,
or holistic, approach in biology. Nevertheless, not all aspects of functionalism
draw on such sources. In the relevant literature many of the statements
which suggest parallels between social processes and biological phenomena
do so with respect to problems of adaptation and survival; however, a
concern with these topics is assuredly not confined to those who adopt a
holistic position in biology. Nor should such interests be interpreted as
demanding that functional explanations be considered as examples of
teleological explanations: It would surely be stretching the concept of teleology
in a most unwarranted fashion to consider Darwin’s theory of
the origin of species as an example of a teleological theory.4 Under these
circumstances, it seems to me worthwhile to go back to two of the primary
sources of functionalism, the theories of Malinowski and of Radcliffe-Brown,
in order to see whether current discussions of functional explanations
really conform to the types of understanding which were originally
sought by those who looked upon functionalism as a new and more
promising approach in the field of social anthropology.

THE FUNCTIONALISM OF MALINOWSKI5

In Malinowski’s earlier writings one does not find that “functionalism”
is used as the name for a specific scientific theory of culture. Yet, as early
as Argonauts of the Western Pacific (1922), a quite definite theory of culture
was clearly implicit in Malinowski’s method, and he was aware of this
fact. When, in the first chapter of that work, he was describing his own
investigative procedures, he stated it as his view that “an ethnographer
who starts out to study only religion, or only technology, or only social
organization cuts out an artificial field for inquiry.”6 And in the last
chapter of the same work this point of view was made perfectly explicit,
and Malinowski contrasted his own views with those of the previously
dominant schools of ethnography. He wrote:

We have seen that this institution [the Kula] presents several aspects closely intertwined
and influencing one another. To take only two, economic enterprise
and magical ritual form one inseparable whole, the forces of the magical belief
and the efforts of man moulding and influencing one another. . . .
It seems to me that a deeper analysis and comparison of the manner in which
two aspects of culture functionally depend on one another might afford some
interesting material for theoretical reflection. Indeed, it seems to me that there
is room for a new type of theory.7

The particular theories with which Malinowski then went on to contrast
his own view were the evolutional studies of anthropologists such as Tylor,
Frazer, and Westermarck; studies of cultural influences by means of contact
and diffusion, as represented by Graebner, Schmidt, Rivers, and Elliott-Smith,
among others; and studies, such as those of Ratzel, concerning
the influence of the environment on institutions. It was Malinowski’s conviction—and
in this he was surely correct—that in the future a much
greater role would be played by theoretical studies which, unlike those of
his predecessors, took as their field of inquiry the influence on one another
of the various aspects of an institution, the study of the social and psychological
mechanism on which the institution is based.8 It is this view
which I shall term the early functionalism of Malinowski, a view which
defined both a method and a theoretical position.
One can readily see the connection which existed between this theoretical
position and Malinowski’s very insistent rejection of historical considerations
in anthropology. When the interdependence of the various aspects
of a culture is stressed, and when it is claimed that any attempt to understand
these aspects singly constitutes a misleading form of abstractionism,
then the diffusionist method of tracing the migration of specific culture-
traits must be rejected as inadequate: An understanding of any trait (or
any complex of traits) is to be derived from understanding its functioning
within its own particular context, rather than from tracing its migrations.
Similarly, the attempt to construct an evolutional history of specific institutions
would be misleading: It is the nature and functioning of these
institutions in their actual contexts, and not their place in a linear temporal
series, that is of ethnological significance. Thus one can see that the theory
of functional interdependence was in itself sufficient to lead to Malinowski’s
rejection of a historical approach in his actual field work and in his theoretical
orientation.9
That a stress on the interdependence of institutions within any given
social context was in fact the basic postulate of Malinowski’s early functionalism
can be documented through the testimony of anthropologists who were,
at the time, deeply influenced by him. For example, Raymond Firth’s
Primitive Economics of the New Zealand Maori (1929), which was
dedicated to Malinowski and at numerous points gratefully acknowledged
his influence, used this functional approach and explicitly rejected a separation
of specific economic practices from their environing conditions. The
same assumption was stressed—though perhaps less obviously—in H. I.
Hogbin’s Law and Order in Polynesia (1934), which was also closely connected
with Malinowski’s views, and for which the latter wrote a lengthy
preface. For example, in his concluding summary paragraph, Hogbin insisted
that there was danger in “isolating single aspects of culture from
their context,” and he stressed the view that Polynesian societies are to be
regarded “as organic structures in which all the parts are interrelated.”
Furthermore, one finds in Gregory Bateson’s Naven (1936) the following
opening sentence: “If it were possible adequately to present the whole of a
culture, stressing every aspect exactly as it is stressed in the culture itself,
no single detail would appear bizarre or strange or arbitrary to the
reader. . . .” Bateson then ascribes this point of view to “Radcliffe-Brown,
Malinowski, and the Functional School”: The “Functional School” is then
characterized as having set itself the task of describing “in analytic, cognitive
terms the whole interlocking—almost living—nexus which is a
culture” (p. 1f.).10
Bateson’s treatment of functionalist views is enlightening in several
respects, marking the confluence of a variety of differing but related functionalist
conceptions. In the first place, as has just been noted, he was aware
of what I have termed the early functionalism of Malinowski, taking this
thesis as the distinguishing mark of all who might be said to comprise the
Functional School. (As we shall see, Bateson was correct in regarding this
view as also characteristic of Radcliffe-Brown.) In the second place, however,
Bateson’s own conception of the ethos and the eidos of a culture
marked a step beyond the position adopted by Malinowski, a step which
took him toward a view of cultural unity which one might designate as
“organismic.” With reference to this aspect of his thought, Bateson acknowledged
the importance he attached to that specific form of a functionalist
position which was adopted by Ruth Benedict.11 He also apparently recognized
that Malinowski’s stress on specific institutions was at odds with an
organismic interpretation of the unity of culture (cf. p. 27); and in this, as
we shall see, he was correct. In the third place, Bateson noted that Malinowski’s
own thought had begun to shift (and in this context he quoted from
the article “Culture” in the Encyclopedia of the Social Sciences): He
pointed out that Malinowski was coming to use the concept of “function”
primarily with reference to the functioning of institutions in satisfying
specific human needs.12 In this connection Bateson indicated (quite rightly)
that there was a fundamental ambiguity in the way in which the term
“function” was being used (cf. pp. 26-27). Finally, we may note that
Bateson also recognized that the primary stress of Radcliffe-Brown’s functionalism
was not merely an emphasis on the interdependence of the
elements within a society but upon the contribution which each of these
elements makes to the solidarity and integration of the group (cf. p. 29).
A view similar to that of Radcliffe-Brown is present in Bateson himself,
for he tended to lay stress upon the function performed by the elements
in a culture in maintaining states of equilibrium (cf. pp. 108-109).
To pass from the first to the second of the points just singled out for
attention in connection with Bateson’s modification of Malinowski’s own
earlier views, we may note that in spite of an emphasis on the interdependence
of the elements within a culture, Malinowski never looked upon a
culture (or upon a society) as possessing a complete unity. The degree of
pluralism inherent in his system is to be seen (at one level) in the emphasis
which he placed on individuality among the members of any society. In his
criticism of the thought of Durkheim, for example, and most clearly in the
major thesis of his Crime and Custom, Malinowski explicitly rejected the
view that the members of a primitive society should be viewed as constituting
a single group which was homogeneous in attitudes and in behavior.
In this respect he differed profoundly from Ruth Benedict’s position, which
had noticeably influenced Bateson. And he differed from her also in not looking
upon a culture as a single whole. The focus of Malinowski’s attention was
always upon institutions, or upon what might better be called institutional
complexes, in which human beings carried on multiform, interrelated activities.
To be sure, these institutional complexes were intimately connected
with one another in any culture, but Malinowski never (so far as I am
aware) spoke in a way which would lead one to assume that he thought
of a culture as being something different from these interrelated parts. And
the focus of his attention was always on the parts.14 Put more generally, the
interrelationships of the various aspects of a culture were never of a sort
which would (on Malinowski’s view) lead one to speak of the culture as a
single, supervenient entity different from its various, specific institutional
aspects. In contrast, however, one may note that it was precisely such a view,
which can appropriately be called “an organismic view,” that Ruth Benedict
espoused. One typical theoretical statement from her Patterns of Culture
runs as follows:

The whole, as modern science is insisting in many fields, is not merely the sum
of all its parts, but the result of a unique arrangement and interrelation of parts
that has brought about a new entity. . . . Cultures, likewise, are more than the
sum of their traits. We may know all about the distribution of a tribe’s form
of marriage, ritual dances, and puberty initiations, and yet understand nothing
of the culture as a whole which has used these elements to its own purpose.15

In contrast to this position we may note that in Malinowski’s articles in
the Encyclopaedia Britannica (“Anthropology” in the thirteenth edition,
1926; “Social Anthropology” in the fourteenth edition, 1929) his application
of what he defines as the functional approach proceeds through analyses
of the functioning of what I have called institutional complexes, i.e., “Marriage
and the Family,” “Economic Organization,” “The Supernatural,” and
“Primitive Knowledge” (viz. Language and Mythology). Furthermore, of
course, the series of his studies of Trobriand life proceeds along essentially
similar lines. However, we need not merely infer an opposition between
Ruth Benedict’s theoretical presuppositions and those of Malinowski concerning
what constitutes a proper method for anthropological study.
In the first place, in his article “Culture” in the Encyclopedia of the Social
Sciences (1931), Malinowski was perfectly explicit as to how a functional
analysis of culture should proceed. As the following quotations from
that article show, his insistence on the interrelationships of activities within
a cultural context (an insistence directed against diffusionists, evolutionists,
and those whom he in other places accused of “antiquarian” interests) did
not lead him to deny the existence of institutional components, or units,
within societies. On his view, functional anthropology is an analytic discipline,
which seeks to understand cultures through the ways in which their
true components function, and these components are institutional in nature.
This can be illustrated through a series of excerpts from that article, in
which Malinowski’s reliance on analysis and elements should be clear:

Culture is a well-organized unity divided into two fundamental aspects—a
body of artifacts and a system of customs—but also obviously into further
subdivisions and units. The analysis of culture into its component elements, the
relation of these elements to one another and their relation to the needs of
the organism, to the environment and to the universally acknowledged human
ends which they subserve are important problems of anthropology (p. 623 b).

The primary concern of functional anthropology is with the function of institutions,
customs, implements and ideas. It holds that the cultural process is
subject to laws and that the laws are to be found in the function of the real
elements of culture. The atomizing or isolating treatment of culture traits is
regarded as sterile, because the significance of culture consists in the relation
between its elements, and the existence of accidental or fortuitous culture complexes
is not admitted (p. 625 a).

The real component units of cultures which have a considerable degree of
permanence, universality and independence are the organized systems of human
activities called institutions. Every institution centers around a fundamental
need, permanently unites a group of people in a cooperative task and has a
particular body of doctrine and its technique of craft (p. 626 a).

In the second place, we may note that in addition to Malinowski’s insistence
on the existence of genuine elements within a culture, the foregoing
passages make clear another fundamental difference between his
views and those of Ruth Benedict. (This difference brings us to the third
point in Bateson’s account of the variant meanings of the term “function.”)
In these passages Malinowski speaks of “human needs” and of “universally
acknowledged human ends.” Ruth Benedict explicitly rejected Malinowski’s
views at precisely this point: She denied the existence of any set of human
needs which is the same in all cultures, or—to put the same point in
another way—she denied that there are any psychological invariants
underlying the forms of organization in different cultures.16 Malinowski’s interest
in such invariants presumably antedated the publication of his article on
“Culture” in the Encyclopedia of the Social Sciences. For example, it was
at least implicit in his willingness to generalize concerning primitive man
in his 1925 essay “Magic, Science, and Religion,” and it was assuredly
present in his treatment of instinct in Part IV of Sex and Repression in
Savage Society, which dates from 1929. However, it was his 1931 article
on “Culture” that introduced the question of fundamental human needs as
an essential aspect of his position, and it is this emphasis to which we shall
refer in speaking of Malinowski’s later functionalism.17
This later functionalist position is best summarized in a 1939 essay
entitled “The Functional Theory,” posthumously published in A Scientific
Theory of Culture and also in the longer theoretical monograph which
gave that volume its name. What characterized Malinowski’s later position,
as distinguished from his earlier statements of his views, is an insistence
on relating institutions, which he always regarded as the genuine elements in
a culture, to human needs. Among these needs there were some which
Malinowski regarded as basic, and he held that most, though not all, of
man’s basic needs were rooted in biological factors. In addition to these
basic needs, Malinowski held that a set of derived needs always arises
through the operation of cultural factors. The general nature of these
derived needs (though not the specific ways in which they are satisfied) is,
according to Malinowski, the same in all cultures: As a consequence, one
can find common characteristics in all societies. And it was with these
common features, and their relations to human needs, that Malinowski’s later
functionalism was concerned. The details of his theory need not here occupy
us. What is important to understand is that there is no necessary
incompatibility between Malinowski’s later focus of interest and his earlier form
of functionalist theory. Thus, it would be mistaken to assume that
Malinowski’s later functionalism superseded or altered his earlier position. Rather,
it operated on a new level of theory, and it can be either accepted or
rejected independently of one’s acceptance or rejection of Malinowski’s
original position.
In order to see that there is no inconsistency between Malinowski’s
earlier and later views, one need merely recognize that he always viewed
those institutional complexes which he took to be the genuine elements
within culture as being characterizable in terms of reciprocal relationships
between individuals. The notion of reciprocity was fundamental in
Malinowski’s analysis of gifts within the Kula cycle (Argonauts of the Western
Pacific); furthermore, in his introduction to Hogbin’s Law and Order in
Polynesia, the same notion of obligations and counterobligations was taken
to be basic for an analysis of primitive law. In fact, in that preface (p.
xxxiii), Malinowski came close to defining the concept of “an institution”
in terms of reciprocal relationships of obligation. He said: “From the point
of view on which it is necessary to focus our attention at present an
institution is nothing but a network, a closely-knit system of rules which
define the mutual behaviour of partners in marriage, in parenthood, in
kinship, clanship, economic co-operation, and so on.”18 Granted this
position, it is easy to see that the institutions which are present in any
society will tend to form a coherent and interlocking pattern of
relationships, since if this were not in general true the individuals within that
society would be caught in a trap of mutually inconsistent reciprocal
obligations: They would have obligations which they could not fulfill, and one
or the other of the mutually incompatible institutions simply could not
survive without substantial alterations.19 Once we recognize that Malinowski
regarded institutions in this way, it is apparent that there need be no conflict
between the emphases which one finds in his earlier and his later forms
of functionalism: The reciprocal relations which are institutions could be
precisely those forms of relationship which fulfill a particular set of basic
human needs. And this was, of course, Malinowski’s position.
However, it is also possible that there should be no single set of basic
needs which are constant, in the sense of being unaffected by the culture.
In that case the reciprocal obligations, or systems of rules, in any one
culture might differ substantially from those to be found in any other. In
general, this might be said to have been Ruth Benedict’s position. To be
sure, she would not have denied that whatever set of rules did exist in any
society would have to result in at least a minimal satisfaction (for a
significant proportion of that population) of such basic needs as those for food,
shelter, security, etc.: If a society failed to provide such satisfaction of needs
it would, of course, simply cease to exist. And Malinowski sometimes made
use of this obvious fact in arguing for his position.20 However, such an
argument is not only too weak to establish the types of cross-cultural
invariance which he sought to establish; as we shall later see, it does not
serve to define any position that can appropriately be designated as
“functionalism.” What Malinowski actually attempted to establish in his later
functionalism was much stronger: He held that there are definite types of
rules to regulate reciprocal obligations in all cultures, and that these types
of rules derive from a specific set of basic needs which are common to all
men, regardless of variations in their cultural inheritance. It was through
the existence of such common basic needs that Malinowski, in his later
work especially, sought to account for the types of rules which he held to
be present in every society. For example, he sought to explain the negative,
proscriptive rules which are directed against infidelity after marriage in
terms of a basic need for sexual satisfaction in the marriage relationship;
he also sought to explain the reciprocal obligations demanded of kinsmen
and clansmen in terms of sentiments, such as those of friendship, which
developed in social interaction.21 As Malinowski made clear in his Scientific
Theory of Culture, what he took to be the fundamental structure present in
any society rested on a basis of common human needs. And it was because
of the universality of these needs that he believed it possible to present a
general theory of culture, which was applicable to all specific cultures,
however variant their superficial qualities might seem to be to those who
did not offer functionalist analyses of them.
We come now to the fourth characteristic isolated by Bateson as a
variant use to which the concept of functionalism was applicable: the view
that in any culture each element must be examined in the light of what it
contributes to the solidarity, integration, and survival of the group. At this
point the theory of functionalism leaves aside consideration of how various
institutions satisfy specific human needs, and considers these institutions in
terms of the contributions which they make to the stability and survival of
the group as a whole.
There is, of course, a possible connection between functionalism as thus
understood and what we have termed Malinowski’s early functionalism.
The former position stressed the interconnections existing among the various
institutional elements within a culture, and such a position is wholly
compatible with viewing the functions of these institutions as contributory to
the persistence of the group as a stable whole.22 And it was (as we shall
see) an emphasis upon this connection between the two theses that was
definitive of Radcliffe-Brown’s position. Or, put in another way, Bateson
was correct in viewing Radcliffe-Brown as sharing Malinowski’s view that
all aspects of a culture were interconnected, and he was also correct in
seeing that it was Radcliffe-Brown, and not Malinowski, who laid primary
stress on the notion that a society as a whole was to be understood as
tending to maintain itself as a stable, self-equilibrating system. This stress
was not compatible with Malinowski’s insistence that culture was to be
understood primarily as a means of satisfying the biological and
psychological needs of individuals, and the crucial difference between the
functionalism of Malinowski and that of Radcliffe-Brown lies precisely here.
Radcliffe-Brown rejected Malinowski’s attempt to connect social institutions
with individual needs, and stressed instead the notion that a society has
needs of its own to which its various institutional elements are to be related
and through which they are to be understood. While Malinowski did not so
frequently criticize Radcliffe-Brown as the latter criticized him, there is
clear evidence that he too recognized where the difference between their
two theories really lay. As he said in A Scientific Theory of Culture:

Functionalism would not be so functional after all, unless it could define the
concept of function not merely by such glib expressions as “the contribution
which a partial activity makes to the total activity of which it is a part,” but by
a much more definite and concrete reference to what actually occurs and what
can be observed. As we shall see, such a definition is provided by showing that
human institutions, as well as partial activities within these, are related to
primary, that is, biological, or derived, that is, cultural needs. Function means,
therefore, always the satisfaction of a need.23

One finds only rare occasions on which Malinowski is willing to speak in
terms of what might be considered as the needs of a social group as a
whole,24 and in this, as we shall see, he is fundamentally at odds with
Radcliffe-Brown.

THE FUNCTIONALISM OF RADCLIFFE-BROWN

In 1922, the year in which Malinowski’s Argonauts of the Western Pacific
appeared, Radcliffe-Brown published The Andaman Islanders. In that work
—which had been written some years earlier—he set forth the main
contentions of his functionalist position, and they remained substantially
unaltered throughout his career.25 His basic theory had two main
characteristics, in one of which he was wholly in agreement with Malinowski, and
in the other not. This may be shown by separating into two parts a single
passage in which he stated the conclusions to be drawn from his extended
interpretation of ceremonial customs among the Andaman Islanders.
The passage begins as follows:

It is time to bring the argument to a conclusion. It should now, I hope, be
evident that the ceremonial customs of the Andaman Islands form a closely
connected system, and that we cannot understand their meaning if we only
consider each one by itself, but must study the whole system to arrive at an
interpretation. This in itself I regard as a most important conclusion, for it
justifies the contention that we must substitute for the old method of dealing
with the customs of a primitive people—the comparative method by which
isolated customs were brought together and conclusions drawn from their
similarity—a new method by which all the institutions of one society or social
type are studied together so as to exhibit their intimate relations as parts of
an organic system.26

Here it is obvious that Radcliffe-Brown is in accord with Malinowski; each
stressed the interrelationships among the various aspects of a culture, and
on this basis each rejected the current comparative methods, whether
evolutionary or “antiquarian” (to borrow Malinowski’s phrase).
Furthermore, Radcliffe-Brown was no less skeptical than Malinowski
concerning the scientific value of attempts to trace the origins of elements
in a culture: Both maintained that the relevant question to be asked
concerning any element in a culture was not one concerning its origins, but
one directed toward establishing the role which that element played in the
functioning of the society in which it was found.27
However, even though Malinowski and Radcliffe-Brown laid equal stress
on the principle of interrelatedness as applied to the elements of a culture,
there was an important difference between them with respect to the uses
to which they put this principle. In his earlier work, Malinowski regarded
the principle as being explanatory in character. It was his view that the
concrete nature of each element in a culture was determined by its
interrelationships with the other activities carried on by the same individuals in
satisfying their wants; therefore, in following out the interconnections among
these various activities, one was in fact explaining why the culture had the
form which it did. In Radcliffe-Brown’s functionalism, however, the
principle of interrelatedness was not regarded as having explanatory power.
In order to understand the elements in a society it was not sufficient to view
them in their relationships to one another: They served as mechanisms
whereby the society as a whole maintained itself, and each element was to
be understood through examining the way in which it enabled the system
of which it was a part to function as a stable and continuing whole. As we
shall see, it was precisely this principle of explanation which Radcliffe-
Brown identified with the notion of function.28
That there was at this point a difference between the theoretical position
of Radcliffe-Brown and the early functionalism of Malinowski can be
illustrated through the succeeding sentences of the passage which I am citing,
for Radcliffe-Brown continues:

I have tried to show that the ceremonial customs are the means by which the
society acts upon its individual members and keeps alive in their minds a
certain system of sentiments. Without the ceremonial those sentiments would not
exist, and without them the social organization in its actual form could not exist.

It should by now be obvious that in his analyses of a culture Malinowski did
not tend to speak in this way. He did not treat a society as a quasi-entity
which was to be differentiated from the activities of its members and
which was capable of acting upon them through particular ceremonial
customs, or the like. Given such passages in the writings of Radcliffe-
Brown, it is clear why Malinowski linked the theory of Radcliffe-Brown
with that of Durkheim, and contrasted it with his own.29
Legitimate as such a contrast may be, it would be mistaken to suppose
that Radcliffe-Brown rejected an appeal to individual, psychological factors
in offering his explanations concerning the nature and functioning of social
systems. This mistake has frequently been made because of the ways in
which Malinowski and Radcliffe-Brown expressed their opposition to one
another’s theories, and because Malinowski’s thought is so often identified
with his later psychologically oriented functionalism. Even though it is true
that Radcliffe-Brown attempted to draw a sharp line of demarcation
between “psychology” and “social science,” he at no point accepted the view
that one could establish laws of society which were not dependent upon
psychological mechanisms for their operation. For example, in the passage
just quoted—and indeed throughout The Andaman Islanders—the
psychological concept of sentiments plays a crucial role.30 This concept, made
familiar by McDougall and by Shand, provided the sole link by means of
which Radcliffe-Brown connected the existence and nature of ceremonial
rites with the needs of society.31 While most of his later investigations and
theoretical discussions do not place an equally heavy emphasis on the specific
concept of sentiments, in such studies as “The Mother’s Brother in South
Africa” (1924) and “On Joking Relationships” (1940), Radcliffe-Brown
continued to appeal to psychological mechanisms when analyzing the
manner in which particular social arrangements were able to function in
preserving a social system. And in his Frazer lecture, “Taboo” (1939), he
stated his conviction on this matter quite explicitly. He said: “Clearly it
is impossible to discuss the social function of a rite without taking into
account its usual or average psychological effect.”32 Furthermore, in an
essay written as late as 1945, he again affirmed the view that sentiments are
the psychological mechanisms through which ceremonial rites exercise their
social function. In that essay, he said:

Rites can be seen to be the regulated symbolic expression of certain sentiments.
Rites can therefore be shown to have a specific social function when, and to
the extent that, they have for their effect to regulate, maintain and transmit
from one generation to another sentiments on which the constitution of society
depends.33

Given, then, the fact that Radcliffe-Brown deliberately used
psychological concepts in offering functional explanations of the elements in a
society, on what basis could he claim that a sharp line was to be drawn
between psychology as a science and what he termed “social science”? The
work in which one finds the most explicit treatment of this issue is A
Natural Science of Society, which constituted a recording of discussions held
at the University of Chicago in 1937. In them he presented the view that all
theoretic natural sciences deal with systems, and that

the social scientist and the psychologist are not concerned with the same system
and its set of relations. The social scientist is concerned with relations he can
discover between acts of diverse individuals; the psychologist with relations
between acts of behavior of one and the same individual.34

According to this view, the raw data for both the psychologist and the social
scientist were the same: Each started from observations concerning acts
of behavior. The distinction between psychological and social-scientific
investigations depended not upon any difference in data, but upon the systems
to which these data were related.
In speaking of “a system,” Radcliffe-Brown referred to a complex
organized whole, whose parts were so interrelated that one would alter their
natures if one attempted to abstract them from the system of which they
were parts. Thus, he frequently used the example of a human organism
as a paradigmatic case of what constituted a system; but it was not, of
course, with the human organism that he believed either the psychologist
or the social scientist should deal. As the above passage suggests, that with
which the psychologist deals are acts of behavior (including thoughts,
beliefs, feelings, etc.), and these acts form the system which constitutes the
mind of a given individual; the social scientist, on the other hand, deals with
acts by which persons are related to one another, and Radcliffe-Brown
assumes that such acts too form a system.35
Unfortunately, in discussing what were the elements or parts of the
systems with which social scientists deal, Radcliffe-Brown was less careful and
explicit than one should like him to have been. On the one hand he
sometimes spoke as if the related elements which comprise a social system were
individual persons, but he also spoke as if they were what he generally
called “social usages.”36 However, in spite of possible misunderstandings
with respect to this issue—which, as we shall see, is an important issue
for the problems which concern us—I believe that Radcliffe-Brown did
have a consistent position which runs through all of his works and which
can be stated in the following way.
The data of the social scientist are drawn from observations of those acts
of behavior which relate individuals to one another; therefore the essential
elements of a social system are individuals, and without human individuals
no social system would exist.37 Thus, in one sense, the units of a social
system are individuals, and Radcliffe-Brown sometimes explicitly says that
they are.38 However, since it is not the task of the social scientist to relate
the acts of any one individual to his other acts, it is not the individual as
such who can be regarded as comprising a genuine unit in a social system.39
Such units must be those acts which relate individuals to one another, and
Radcliffe-Brown calls attention to this fact in saying that the units of a
social system are “human beings regarded as sets of behavioral events,”
namely as the behavioral events through which they are related to one
another.40 And when he is most careful in discussing the units of a social
system, Radcliffe-Brown specifies another characteristic which such
behavioral events must have: They must be recurrent.41 Furthermore, we may
say that his view demands that these recurrent behavioral events lead to
continuity in the relatedness between individuals, since every system
(according to Radcliffe-Brown’s use of that term) must have structural
continuity over a span of time.42 The acts which fulfill these conditions are
referred to by Radcliffe-Brown as “social usages,” and of them he says:

A social usage is not merely a common mode of behavior. . . . [It] always
involves a rule of behavior: there are proper or appropriate ways of behaving
under certain circumstances. ... A social usage is more than simply something
which people do. A, B, and C make bows; D recognizes that that is the way to
make a bow. The fact (1) that some or many people observe it, and (2) the
fact that a large number of people recognize it as a rule constitutes the reality
of a social usage.43

It is through such usages that individuals are related to one another in
persisting social systems. As Radcliffe-Brown put the matter:

It is the structural form of the society which the social scientist has to describe.
(One might call that form “non-mental” or “super-psychic” if one liked.) It
[the structural form] is explicit in the social usages.44

For this reason it is proper to consider social usages as the specific units
whose interrelationships form a social system.
In this connection there is one final point to be noted, and that is the
stress that Radcliffe-Brown placed on the fact that the total set of social
usages form a single, coherent system. As we have noted, he shared with
Malinowski an acceptance of the principle of interrelatedness: that the
various elements of a culture affect one another. When translated into
terms of social usages, this principle demanded that the specific, repeated
forms of behavior evinced by individuals in their relations to one another
form a coherent pattern, and it was to this principle that Radcliffe-Brown
referred when he discussed “the functional consistency of social systems.”45
That there is functional consistency within any particular system implies
that (in the long run) various social usages cannot be in conflict with one
another. However, Radcliffe-Brown pointed out that in addition to this
minimal form of consistency, in most societies these usages react upon each
other and reinforce one another, thus leading to a marked degree of
integration within the social system.46
Having defined Radcliffe-Brown’s position with respect to the elements
which enter into and form the nature of a social system, we are in a better
position to understand what he was claiming when he insisted that the
elements of such a system were to be understood through understanding
their functions in maintaining the coherence and continuity of that system
as a whole. When he says, as he did in The Andaman Islanders, “Every
custom and belief of a primitive society plays some determinate part in
the social life of the community, just as every organ of a living body plays
some part in the general life of the organism,”47 this is not to be understood
as signifying that there is, so to speak, some selective power in the society
as a whole which accepts or rejects specific beliefs and customs, only
allowing those to continue which are of benefit to it. Customs and beliefs are
social usages, and no such usage can long persist if it stands in conflict with
other usages which are present within the system. It is in this sense that
the system as a whole may be said to “select” or to “reject” particular
forms of behavior. Bearing this interpretation in mind, one need not read
Radcliffe-Brown as if he were holding that a social system, considered as
a whole, exercises a selective power and control over each of its
components, in some way guiding them for its own good.48
To be sure, one may wish to criticize Radcliffe-Brown’s theory for having
failed to indicate that the relationship between some elements and the
functioning of the system as a whole is a very indirect relationship (as is also
the case with respect to some features of organic systems). Were such a
criticism justified (as I believe it to be), it would considerably weaken
Radcliffe-Brown’s contention that the proper way to understand every
social usage is through relating it to the role which it plays in maintaining
the coherence and the continuity of the system as a whole.49 One might
also wish to criticize him (as has often been done, and as I should be
inclined to do) for having overstressed the elements of coherence and
continuity in social systems, and for having failed to see how various institutions
may change radically, and even suddenly, and alter the system in more
drastic ways than he was inclined to take into account.50 Yet, such possible
criticisms do not suggest that his view of the relationships among the
elements in a social system rested on false analogies to biological systems or
on methodologically objectionable forms of teleological explanation.
What has generally been considered methodologically objectionable in
classic formulations of teleological explanations is, basically, the fact that
they failed to establish any means by which the purposiveness attributed to
an organism could be effective in the present, and could lead that organism
to the future goal which the teleologist claims that it may be said to seek.
Whatever other faults his views may have, no such charge can justifiably
be leveled against the type of functional explanation which Radcliffe-Brown
offered in his accounts of anthropological facts.51 As we have seen, his
theory did not deny the existence of psychological determinants in
functional explanations, and such explanations (as he employed them) in fact
rested upon the existence of these determinants. And we are now in a
position to see why this should have been the case. A social system is a set
of interrelated social usages; social usages, however, are acts of individuals;
they are acts which “while they characterize a certain number of individuals
are the product in these individuals of the action upon them of other
individuals within a specific social system.”52 This being so, the whole
dynamism of any social system will rest upon the interactions of individuals,
not upon an unspecified and “occult” relationship whereby the supposed
needs of the system as a whole presumably make themselves directly felt in
each of its component parts. That Radcliffe-Brown’s emphasis on the role
of individuals in a society was basic to his functionalism throughout his
career can perhaps best be seen through citing a passage from the preface
to the revised edition of The Andaman Islanders (1933), in which he
attempted to specify the meaning of “function”:

The notion of function in ethnology rests on the conception of culture as an
adaptive mechanism by which a certain number of human beings are enabled
to live a social life as an ordered community in a given environment.53

However, if this is taken as his view of the nature of functional explanations,
and if he too emphasized psychological determinants in such explanations,
what—one may wonder—was the basic difference between his view and the
later functionalism of Malinowski?
The difference, I submit, may be stated in terms of the very different roles
which psychological determinants played in each theory. Both Radcliffe-
Brown and Malinowski viewed a social system as constituting a whole, and
both conceived of any such system as having a number of institutional
components, or elements. Furthermore, each regarded it as necessary to appeal
to psychological determinants in analyzing the functioning of social systems.
However, according to Malinowski’s view (and now I am only discussing his
later functionalism) it was not possible to understand the structural
elements within a society without viewing them as reflections of a set of basic,
universal needs. In Malinowski’s system, therefore, the psychological
determinants of a society consisted in a set of psychological constants which
underlay all of the actions of individual human beings: The comparability
of social systems depended for him on the universality and stability of these
basic needs.54 Radcliffe-Brown, on the other hand, denied that any
generalizations of the science of psychology could provide such a basis for social
anthropology. While he did admit that one could meaningfully speak of a
basic human nature, which it was the business of general psychology to
study, he did not regard generalizations concerning these universal features
of human beings as providing a basis for the analysis of social institutions.
It was his contention that, in addition to whatever factors all human beings
had in common, there were “special psychologies” for different social
groups.55 On his view, men were malleable, and thus special characteristics
were imparted to them by the social usages under which they lived.
Therefore, psychological determinants could not be regarded as the foundations
upon which the nature of social usages rested; they were the mechanisms
by means of which these usages tended to agglutinate into a single, coherent,
self-maintaining system. As a consequence, Radcliffe-Brown’s view differed
from Malinowski’s in the fact that he rejected the assumption that the
comparability of social systems rested upon constancy in the nature of human
beings. Instead, he believed that the constants in social anthropology were
to be found in the means by which social systems maintained themselves. On
his view, it was the task of social science to discover these constants, and to
state them as laws. In short, according to Radcliffe-Brown, a science of
society was independent of a theory of psychological constants: Its task was
to state the abstract general conditions under which systems of social usages
maintained themselves, and the conditions under which they underwent
change.56
Such a definition of the task of the comparative study of societies not only
distinguishes the functionalism of Radcliffe-Brown from the later
functionalism of Malinowski, it also serves to distinguish it from Malinowski’s earlier
views. Those views were not put forward as a means of establishing a basis
for comparative studies: Unlike Radcliffe-Brown, it was only late in his
career that Malinowski showed an interest in such studies. In what I have
termed his early functionalism, Malinowski regarded the principle of inter¬
relatedness as having a genuinely explanatory function in social anthro¬
pology: One could understand the character of one element in a culture
through seeing how it was related to all of the other elements in the same
culture. Radcliffe-Brown, however, must be interpreted as having rejected
the view that a descriptive method of this sort can be regarded as “explana¬
tory”; he insisted most strongly on drawing a sharp distinction between
idiographic and nomothetic enquiries, and it was his contention that a
science of society was exclusively concerned with establishing acceptable
general propositions.57 Malinowski, too, in his later theory, sought to go
beyond the principle of interrelatedness, and attempted to formulate a sub¬
stantive theory which could serve to explain the basis of each essential
feature of man’s social life. Thus, in one respect at least, Malinowski’s later
thought and the views of Radcliffe-Brown were at one: Both conceived of
a proper social science as being one which was capable of explaining what
all societies had in common. Malinowski sought to relate such features to
322 2. STRUCTURE OF SCIENCE

a universal set of basic needs; Radcliffe-Brown rejected this psychological
approach and sought to find the common features in the needs of the society
as a self-maintaining system. Each assumed that a generalizing science of
social life had to be general in a very special way: Before it set out to explain
any particular feature of social systems it had to establish the most general
features of all such systems and account for them.58 When this aspect of
the thought of Malinowski and of Radcliffe-Brown is stressed, one can see
why philosophers have been skeptical as to whether functionalism can pro¬
vide an adequate theoretical framework for explanations in anthropology
and in sociology. It is to a brief elucidation of this point that I shall now
turn.

CONCLUSION: FUNCTIONALISM AND THE NATURE OF EXPLANATION

In the foregoing account we saw that the basic starting point of function¬
alism for both Malinowski and Radcliffe-Brown was the doctrine that all
elements within a social system are mutually related; as Bateson pointed out,
it was this thesis which characterized “the Functional School.” However, as
we also saw, Malinowski’s later functionalism operated on a quite different
theoretical level: Its purpose was to account for the universal elements in
cultures, rather than being confined to a thesis concerning the relations ob¬
taining among the coexisting elements in any one culture. It should be
obvious that even if we assume both aspects of Malinowski’s views to have
been sound, they were independent of one another; and we may note that,
if sound, they would play quite different roles in the economy of our knowl¬
edge of societies. Malinowski’s early functionalism was a descriptive thesis
concerning specific cultures, and his later functionalism was concerned with
comparisons of the elements to be found in all cultures. Radcliffe-Brown’s
theory operated on these same two levels. As we have seen, he shared
Malinowski’s descriptive thesis concerning the interrelatedness of the ele¬
ments within any one culture, and he also sought to generalize concerning
the features which all societies have in common. On his view, however, these
common features were structural principles making for the cohesiveness
and continuity of societies rather than being specific, repeated elements tied
to individual needs.
Now, it seems to me clear that the first, or descriptive, principle in each of
these theories ought not to be treated as if it were intended to provide an
empirical law. To view it as a law would be to render it either empirically
false, or so to modify it that it would become too weakened to possess any
explanatory power. This can be suggested in the following way.
If the principle of interrelatedness were to be formulated as a law, it
would very probably be stated in something like the following form: In any
MAURICE MANDELBAUM 323

culture, all elements within the culture are related to all other elements in
such a way that none would be as it is if the other elements were different
from what they are. However, such a law would imply that any change in
any element of any culture would entail that all other elements in that cul¬
ture would also undergo change. This principle can surely not be thought to
be true, and neither Malinowski nor Radcliffe-Brown can easily be inter¬
preted as having believed it to be true.59 However, if the principle were to
be weakened in a manner that would make it empirically plausible, it would
probably have to be stated as holding that many elements in a culture have
some effect on one another; that no element can be understood as a unit
which stands in isolation from all others. This weaker principle was indeed
advocated by Malinowski and by Radcliffe-Brown, and their work—as well
as the work of their followers—has assuredly established its plausibility; it is
a principle which has, in their hands, undermined some of the assumptions
shared by most earlier diffusionists, and before them by most who ap¬
proached social anthropology from an evolutionary or an “antiquarian”
point of view. However, when phrased in this weakened form, the principle
no longer permits one to deduce specific consequences in specific instances.
It could therefore not perform the explanatory function of a law on the
classical deductive-nomothetic model. Nor is it likely that it could be suc¬
cessfully rephrased in probabilistic terms, and be used in explanatory gen¬
eralizations of a probabilistic type.
Under these circumstances, one might seek to interpret the descriptive
principle of interrelatedness (as this principle was used by Malinowski and
by Radcliffe-Brown) as a merely methodological principle of heuristic sig¬
nificance. On such a view, these functionalists would be interpreted as having
merely held that one should always look for interrelationships among the
elements of a culture, for fear of misunderstanding them, or of misdescribing
them, because one may have overlooked their relationships to other elements
with which they were connected.60 However, I believe that this strictly
heuristic interpretation of the principle of interrelatedness would assuredly
be too weak an interpretation to be an accurate reflection of Malinowski’s
thought, and I believe—though with perhaps less assurance—that it would
also be too weak an interpretation to do justice to the views of Radcliffe-
Brown, and of functionalism in general. What was at stake in functionalism
was not merely an important heuristic device but an explanatory principle:
The various elements in a culture were held to be what they actually were
because of their interconnections with one another. This explanatory prin¬
ciple, it should be noted, was descriptive in character, and was not a law-like
generalization from which further specific consequences were expected to
be deduced. Thus, if I am not mistaken, the position of functionalism—so
far as the principle of interrelatedness was concerned—presupposed that,
in some cases at least, descriptive analyses constitute explanations of par¬
ticular phenomena. A position of this sort—though making use of a different
type of descriptive analysis—has become familiar to us through William
Dray’s discussion of “the continuous series model of explanation.”61 In
Malinowski’s early functionalism, the characteristics of a particular phe¬
nomenon were held to be explained when it was shown why this phenom¬
enon was as it was; and an explanation of its nature was assumed to be
attainable through pointing out how it was related to a wide variety of other
contemporary phenomena in the same culture.62 Thus, while functionalist
explanations of this type stand opposed to genetic investigations, they at¬
tempt to produce synchronic analyses which in other respects resemble the
diachronic approach of traditional historical investigations. This being the
case, it is obvious that insofar as one takes it to be true that all “explanation”
must conform to the classical deductive-nomothetic model, or else to proba¬
bilistic models, descriptive analyses of the foregoing type will not be taken
to be explanatory. As a consequence, neither the functionalism of Mal¬
inowski nor that of Radcliffe-Brown will appear to be adequate to many
philosophers of science.
Turning now to the second level of theory which is to be found in both
Malinowski and Radcliffe-Brown, we can see why their cross-cultural gen¬
eralizations have also failed to satisfy philosophers of science. And here, I
may say, my sympathies lie wholly with their critics. It will be recalled that
Malinowski sought to establish a general scientific theory of culture by relat¬
ing a set of universal cultural elements to a set of universal human needs.
However, as numerous critics have pointed out, he never succeeded in
showing in concrete instances that the specific forms which those elements
assumed were necessary to the satisfaction of these needs; nor did he ever
attempt to show that the existence of these needs (assuming them to be
universal) provided sufficient conditions for the existence of the particular
elements whose universality he claimed to have established. Thus, there
was a looseness of connection between the purportedly universal and con¬
stant factors in human nature and human culture and the specific elements
which Malinowski and other descriptive anthropologists found to be
present in different cultures. Nor was Radcliffe-Brown’s theory of the neces¬
sity of certain practices for the survival of a society any less vulnerable to
criticism from the same point of view.63 Thus, the attempt to establish laws
of social organization on the assumptions of either Malinowski or Radcliffe-
Brown ended in failure: Their generalizations permitted no deductive con¬
sequences with respect to the specific nature of the practices of the peoples
which they, and other descriptive anthropologists, investigated.
Given the inadequacies of this second level of theory in functionalism,
and given the present reluctance of many philosophers of science to accept
descriptive analyses as offering explanations, it is small wonder that the
movement known as functionalism in social anthropology should have fallen
into serious disrepute among philosophers of science. The blame, however,
can scarcely be attributed to narrowness or rigidity on the part of the
philosophic critics of functionalism. If, instead of seeking highly general
principles which would delineate what all cultures, as wholes, have in
common, Malinowski and Radcliffe-Brown had moved more continuously
from the first level of their theories—their insistence on the principle of
interrelatedness—to a comparative study of the ways in which the various
elements of culture are related to one another, more accurate and more
testable generalizations might have resulted. This, however, would have
involved abandoning the attempt to find what I have elsewhere termed
“global laws” in favor of seeking “abstractive laws” concerning the specific
relations of elements within societies.64 Such abstractive laws would attempt
to correlate the specific nature and changes of specific elements of social
structure with one another, and would thus be “functional” in one of the
classic senses of that term: Two properties, events, or other characteristics
may be said to be “functionally” related if there exists a nonaccidental
covariance between them.65 If I am not mistaken, it was—at least in part—
this conception of “function” which had been suggested by the original
descriptive analyses of Malinowski and of Radcliffe-Brown, but which had
unfortunately been minimized by them in their emphasis upon a general
theory of societies. And it is this original and important aspect of their
theories which is, in my opinion, apt to be neglected when functionalism in
social anthropology is assumed to be modeled on organismic biology, and
when it is interpreted as an example of teleological explanation—whether
the teleology in question be that connected with traditional vitalism, or
whether it be the quasi-teleology imputed to self-regulating mechanisms
which operate through negative feedback.

NOTES
1. “Teleological Explanation and Teleological Systems” was published in 1953
in Vision and Action, edited by Sidney Ratner. “A Formalization of Functionalism”
dates from the same year but was first published in 1956 in Logic Without Meta¬
physics. Since the latter paper was closely tied to Robert K. Merton’s essay on
“Manifest and Latent Functions” (cf. note 3, below), many of its aspects are not
incorporated in The Structure of Science; it should therefore be separately consulted.
However, so far as I can see, the position expounded in The Structure of Science
does not depart from this earlier formulation of Nagel’s view. (A much briefer dis¬
cussion of functionalism is to be found in his contribution to a symposium on
“Problems of Concept and Theory Formation in the Social Sciences,” American
Philosophical Association, Eastern Division, vol. I [University of Pennsylvania Press,
1952].)
2. It has, of course, also been one of the chief sources of much of this work.
Among the philosophers of science who have dealt with the problem, I might
mention Sidney Morgenbesser’s “Role and Status of Anthropological Theories,”
Science, vol. 128 (1958), pp. 285-288; Carl G. Hempel’s “The Logic of Functional
Analysis,” originally published in 1959, but reprinted in a revised version in his
Aspects of Scientific Explanation (1965); Israel Scheffler’s Anatomy of Inquiry
(1963), especially Sections 5 and 9 of Part I; and Chapter IX of Robert Brown’s
Explanation in Social Science (1963).
In Dorothy Emmet’s Function, Purpose and Powers (1958), one finds a very
similar linkage of functionalism in social anthropology with questions concerning
teleological systems.
3. In the present paper I shall not deal with functionalism in sociology. It is a
considerably later movement, since functionalism in social anthropology can be
regarded as originating with publications by Malinowski and by Radcliffe-Brown in
1922, whereas 1949 is the year in which functionalism became an important issue
in general sociology: it was in that year that Talcott Parsons’ Essays in Sociological
Theory appeared, and Robert Merton published his classic article, “Manifest and
Latent Functions,” in Social Theory and Social Structure. The connection between
functionalism in anthropology and Parsons’ views is evident in the latter’s essay on
“The Theoretical Development of the Sociology of Religion,” which is included
in the 1949 volume of essays. In the case of Merton’s article, the evidence of con¬
nection is unambiguous. (For example, cf. Social Theory and Social Structure,

Sec. […].)
Among the other important discussions of functionalism in sociology, the reader
may be referred to Marion J. Levy’s The Structure of Society (1952), and to the
following articles: Harry C. Bredemeier, “The Methodology of Functionalism,”
American Sociological Review, vol. 20 (1955), pp. 173-180; Bernard Barber,
“Structural-Functional Analysis: Some Problems and Misunderstandings,” American
Sociological Review, vol. 21 (1956), pp. 129-135; Kingsley Davis, “The Myth of
Functional Analysis,” American Sociological Review, vol. 24 (1959), pp. 757-772;
Ronald P. Dore, “Function and Cause,” American Sociological Review, vol. 26
(1961), pp. 843-853.
In addition, George C. Homans has dealt with the meanings of functionalism on
frequent occasions; the last of these being his presidential address to the American
Sociological Association in which he severely criticized the position. (Cf. “Bringing
Men Back In,” American Sociological Review, vol. 29 [1964], pp. 809-818.) For
his earlier analyses, see The Human Group (New York, 1950), pp. 268-272; his
review of Radcliffe-Brown, Structure and Function, in American Anthropologist,
vol. 56 (1954), pp. 118-120; and Sentiments and Actions (New York, 1962),

4. Recent uses of the term “teleology” have extended its application to a wider
variety of phenomena than had previously been regarded as being “teleological.”
For example, the term is presently applied to man-made physical systems regulated
by negative feedback and to those instances in which human action is motivated by
attempts to attain future goals, as well as being applied to biological phenomena
as these phenomena are interpreted by vitalists. One can, of course, find connections
among these three types of use, as well as connections with other traditional uses
(e.g., the theological uses of the term). However, so far as I can see, no one of
these three uses has any necessary connection with functionalism as a theory in
social anthropology, and it will be my purpose to suggest that Hempel was mistaken
when he said: “Historically speaking, functional analysis is a modification of teleo¬
logical explanation, i.e., of explanation not by reference to causes which ‘bring
about’ the event in question but by reference to ends which determine its course”
(loc. cit., p. 303).
5. A useful bibliography of Malinowski’s writings, and of the literature concern¬
ing him, is to be found in Raymond Firth (ed.), Man and Culture, An Evaluation
of the Work of Bronislaw Malinowski (London, 1957). Some of the articles cited
in that bibliography have more recently been reprinted in a collection entitled Sex,
Culture and Myth (New York, 1962).
6. Argonauts, p. 11. Although Malinowski’s first book, The Family Among the
Australian Aborigines (1913), was based upon presuppositions which he soon
abandoned, one finds in its concluding chapter expressions of precisely the same
point of view as that cited above: The institution of the family, Malinowski argues,
cannot be understood in isolation from its social conditions, e.g., from territorial
and tribal structure, land ownership and economic practices, and moral, customary,
or legal norms (cf. p. 293 and p. 301 of the 1963 edition, edited by J. A. Barnes).
7. Argonauts, p. 515 (italics mine). That the interplay of various aspects of a
culture was basic in Malinowski’s earlier views regarding methodological and theo¬
retical issues also becomes clear in his special preface to the third edition (1932)
of The Sexual Life of Savages (cf. especially p. xx).
8. Argonauts, p. 516.
9. I do not deny that the unsatisfactoriness of many diachronic studies and
their highly conjectural nature, as well as a variety of personal factors, played a
part in Malinowski’s anti-historical bias. I merely wish to point out that if one takes
his anti-abstractionist thesis as seriously as he himself did, then there are good
theoretic reasons for the position that he adopted. This point comes out very clearly
in Sex and Repression in Savage Society (1927), pp. 181-182.
To be sure, one might accept this early functionalism and yet be interested in
tracing some dynamic, directional process in the society as a whole. However, on
Malinowski’s premises, any such interest could only be satisfied after the primary,
synchronic studies had been adequately carried out. Thus, he rejected diachronic
studies as not germane to his own immediate task. However, as the closing sen¬
tences of Crime and Custom in a Savage Society (1926) make clear, he did not
exclude the possibility of future studies of historical and evolutional factors in
society; and this position was one which he repeated at frequent intervals, e.g., in
the foreword to the third edition (1932) of The Sexual Life of Savages, and in
Chapter III of his posthumously published Dynamics of Culture Change (1945).
10. In his History of Ethnological Theory (1937), R. H. Lowie gives an account
of Malinowski’s thought which may be characterized as vituperative in tone, but
which is accurate in content; he too finds the emphasis on an interdependence of
the aspects of a culture as the characteristic feature of Malinowski’s early function¬
alism.
A similar interpretation of Malinowski’s early functionalism is implicit in remarks
made by J. A. Barnes in his helpful introduction to the 1963 reprint of Malinowski’s
The Family Among the Australian Aborigines (q.v., p. xix). One may also note
that Audrey Richards comments on how mistaken it is to identify Malinowski’s
functionalism with a theory of human needs and to contrast it in this respect with
the functionalism of Radcliffe-Brown (cf. “The Concept of Culture in Malinowski’s
Work,” in Firth, Man and Culture, p. 17, n. 2). I shall return to this point in dis¬
cussing Malinowski’s later form of functionalism.
11. Cf. Naven, pp. ix, x, 112n., 258, et pass.
12. In retrospectively calling attention to his concern with the concept of basic
human needs, Malinowski himself specifically referred to the same article, which
had been published in 1931 (cf. his posthumous Scientific Theory of Culture, p. 22).
13. This, of course, is a fundamental thesis in Crime and Custom, and in this
respect Malinowski is frequently critical of the work of Durkheim. A further, and
not wholly separable, criticism of Durkheim is that the latter’s emphasis on the
collective nature of social phenomena led him to neglect (according to Malinowski)
the biological basis of cultural facts (cf. his preface to Hogbin, Law and Order,
p. xxxviii, and Malinowski, The Dynamics of Culture Change, p. 42). In the latter
respect, Malinowski linked Radcliffe-Brown with Durkheim, as the first of the latter
passages will show.
14. Even in his last work, when—as we shall see—the focus of his interests had
somewhat changed, and he had come to emphasize psychological needs as underlying
institutions, Malinowski had insisted on viewing these institutions as what he termed
“the legitimate isolates of cultural analysis” (Scientific Theory of Culture, pp. 160-
161, also, p. 54). For him, such isolates were entities to be utilized both in observa¬
tion and in theoretical discourse (ibid., p. 27). Furthermore, as we shall see, he
disagreed profoundly with Benedict’s denial of universals in culture, i.e., with her
view that specific cultures fail to satisfy the whole range of man’s basic needs (ibid.,
p. 40).
15. Benedict, Patterns of Culture, pp. 42-43 of Mentor Books edition. As Boas
pointed out in his introduction to this book, Ruth Benedict’s interest in the total
configuration of a culture “is distinct from the so-called functional approach to social
phenomena in so far as it is concerned rather with the discovery of fundamental
attitudes than with the functional relations of every cultural item.”
16. Ibid., pp. 45-46. For an example of Malinowski’s contemptuous rejection of
Benedict’s views, cf. “Culture As a Determinant of Behavior” in Factors Determin¬
ing Human Behavior (Harvard Tercentenary Publications, 1937), p. 143.
17. It is now not unusual to find that Malinowski’s functionalism is solely iden¬
tified with his later attempt to relate social institutions to biological and psycho¬
logical needs. As I have indicated, Audrey Richards comments on this interpretation
(cf. her contribution to Firth, Man and Culture, p. 17, n. 2). It is implicit in the
views of E. R. Leach (cf. his contribution to the same volume and his essay on
Malinowski and Frazer in Encounter, November 1965), and the same truncated
view of Malinowski’s theoretical position is present in Leon J. Goldstein “The Logic
of Explanation in Malinowskian Anthropology,” Philosophy of Science, vol. XXIV
(1957), pp. 156-166. In his polemics against Malinowski, Radcliffe-Brown in the
end is also guilty of this error (cf. “Functionalism: A Protest,” American Anthro¬
pologist, vol. LI [1949], p. 320f.). However, to view Malinowski’s position in this
way is to fail to understand the theoretical views which originally guided his field
work, and the nature and extent of his influence on an important generation of
anthropologists.
18. One would not, of course, have to conceive of rules in terms of reciprocal
obligations, but it is clear that Malinowski did so conceive of them. (For example,
cf. p. xxxv of the same preface, as well as Malinowski’s own analysis of law in
Crime and Custom.)
19. To be sure, incompatibilities of obligations may exist in the experience of
any one individual and may even be fairly characteristic of certain groups of in¬
dividuals; however, such incompatibilities cannot (on Malinowski’s view of the
nature of an institution) be general and continuing.
20. For example, in his preface to Hogbin, Law and Order, p. xxxi.
21. Ibid., pp. xxxvii f. Malinowski’s doctrine of sentiments owed much to the
psychology of A. F. Shand, as is clear in the preface to Sex and Repression in Savage
Society. (Cf. also the remarks on this point in the essays of Firth and of Fortes in
Man and Culture, edited by Firth.)
22. Audrey Richards tends to hold that this emphasis was present in Malinow¬
ski’s early functionalism (cf. her essay in Firth, Man and Culture, p. 17), although
she admits that it is difficult to know whether the idea of the unity of a culture
preceded the idea of the interconnectedness of its elements, or whether it followed
from it (ibid., p. 18). The interpretation which I am offering assumes that the latter
is Malinowski’s most usual view. That there is an ambiguity in this matter can
perhaps be best seen by examining the concluding paragraph of Malinowski’s Crime
and Custom, where both points of view are stressed. A combination of both points
of view is also present in a brief passage quoted from an early (1912) article of
Malinowski’s in Radcliffe-Brown, “A Note on Functionalism,” Man, vol. XLVI
(1946), 30, p. 38b.
23. A Scientific Theory of Culture, p. 159. As we have noted, in his preface to
Hogbin’s Law and Order, Malinowski remarks that the fundamental difference
between his views and those of Radcliffe-Brown (as well as the school of Durkheim),
lies in the fact that the latter neglect the individual and the biological factor (i.e.,
the factor of individual needs) in culture (cf. p. xxxviii). The same point is repeated
in “The Group and the Individual in Functional Analysis,” American Journal of
Sociology, vol. 44 (1939), p. 939, note. (Now also available in Malinowski, Sex,
Culture, and Myth, p. 224 n.)
24. An early, isolated passage of this sort is to be found in Malinowski’s 1925
essay “Magic, Science, and Religion” (cf. Anchor edition of the volume bearing
the same title, pp. 39-40). However, the chief passage of this sort is in pp. 168-170
of A Scientific Theory of Culture. However, the latter passage ends with the remark
that such a use of the concept of function is to be considered primarily as a heuristic
device, and Malinowski immediately returns, in the next pages, to his theory of
biological and psychological needs.
25. Of course, Radcliffe-Brown amplified his theoretical position in his later works
(cf. A Natural Science of Society, recorded in 1937, but not published until 1948,
and the later contributions included in the posthumous volume, Method in Social
Anthropology). However, it seems fair to say that the chief alteration in his various
later formulations of his position was that he came more and more to state his
theory as a theory of “the structure of society.” This later insistence that social
anthropology deals with “society,” and not “culture,” and his growing emphasis on
the term “structure,” were in large measure to be understood as reactions against—
and, indeed, as protests against—Malinowski’s position. (For Radcliffe-Brown’s most
explicit statement concerning his relationships to Malinowski, cf. “A Note on Func¬
tional Anthropology,” Man, vol. XLVI [1946], 30, pp. 38-41.)
26. The Andaman Islanders, p. 324. I quote from the 1933 edition, but the body
of that text appears to be unchanged. However, the Preface was substantially
altered, and Appendix B (on the Andaman Languages) was entirely rewritten.
27. Cf. The Andaman Islanders, p. 229. For a passage in which Radcliffe-Brown
acknowledged the convergence of his views and that of Malinowski and of Margaret
Mead with respect to this issue, cf. his address “The Present Position of Anthro¬
pological Studies” (1931), reprinted in Method in Social Anthropology. (The refer¬
ence is to be found on page 70 of that volume.)
28. Among the many statements which Radcliffe-Brown made concerning the
essential nature of functional explanations in social science, perhaps the clearest are
to be found in his article “On the Concept of Function in Social Science,” American
Anthropologist, vol. 37 (1935), pp. 394-402. However, it may be well to cite addi¬
tional passages, starting with The Andaman Islanders, pp. 229-230, and continuing
in chronological order through a statement in Method in Social Anthropology, p. 62,
to A Natural Science of Society, pp. 85 and 154-156, and to the introduction to
Structure and Function in Primitive Society, p. 12. These are, of course, merely a
few of the references which are directly relevant.
29. Cf. above, note 13. The same linkage is to be found in R. H. Lowie’s treatment
of Radcliffe-Brown in his History of Ethnological Theory: He discusses Radcliffe-
Brown in connection with “the French School,” rather than in connection with
Malinowski and functionalism.
Radcliffe-Brown readily acknowledged his debt to Durkheim and to the latter’s
followers. For example, see the prefaces to The Andaman Islanders, and also the
important acknowledgement in the footnote on page 325 of that work. Among the
many other passages which establish this connection, I shall cite only the following:
Structure and Function in Primitive Society, pp. 14, 166, 176, and 200. However,
I should like also to call attention to the fact that in his article “On the Concept of
Function in Social Science,” (American Anthropologist, vol. 37 [1935], p. 394)
Radcliffe-Brown attributed the basic concept of the function of a social institution
to Durkheim.
30. According to Radcliffe-Brown, one of the major factors on which the existence
of any society depends is the existence in it of a shared body of moral customs,
and these he regarded as dependent upon sentiments (cf. The Andaman Islanders,
pp. 400 ff.). Thus he says: “These sentiments and the representations connected with
them, upon the existence of which, as we have seen, the very existence of the society
depends, need to be kept alive, to be maintained at a given degree of intensity.
Apart from the necessity that exists of keeping them alive in the mind of the in¬
dividual, there is the necessity of impressing them upon each new individual added
to the society, upon each child as he or she develops into an adult” (p. 404).
31. For a concise statement of the empirical assumptions involved in Radcliffe-
Brown’s position, see the five points which he takes as his working hypothesis: In
each of them the concept of a sentiment is central (The Andaman Islanders, pp.
231-232).
While Radcliffe-Brown did explain some ceremonials, such as the rite of weeping
(p. 245) and ceremonial dancing (p. 252), in terms which were entirely consistent
with this working hypothesis, it may be doubted whether the same can be said with
respect to his explanations of bodily ornamentation (cf. pp. 320-323). In the latter
explanations he failed to show how sentiments were actually involved. This suggests
that he had either failed to state his working hypothesis with sufficient clarity, or that
he had failed to test it with sufficient rigor.
32. All three essays cited above are reprinted in Structure and Function in Primi¬
tive Society. The quotation from “Taboo” is to be found on page 144 of that volume.
33. Structure and Function in Primitive Society, p. 157. Similarly, he says:
“The social function of the rites is obvious: by giving solemn and collective expres¬
sion to [a particular system of sentiments] the rites reaffirm, renew and strengthen
those sentiments on which the social solidarity depends” (ibid., p. 164).
34. A Natural Science of Society, p. 45. This work was originally published in
1948, but I shall cite the 1957 reprinting of it.
For a discussion of Radcliffe-Brown’s use of the concept of systems, their dis¬
tinction from classes, and their relation to his view of natural laws, cf. especially
pp. 19-22 of the same work.
35. Ibid., p. 47. In my opinion it is true that social scientists do deal with what
Radcliffe-Brown terms systems of acts through which individuals are related to one
another. However, to define the province of social science (as distinct from psychol¬
ogy) in this way cannot be considered adequate. In the first place, not every set of
acts through which persons are related forms what Radcliffe-Brown characterizes
as a system. In the second place, it is not likely that we would regard it as within
the province of a social scientist, rather than a psychologist, to investigate all cases
in which behavior evinces systematic connections between the interrelated acts of
individuals. (For example, an extreme case of sibling rivalry in a particular family
would not usually be assigned to the field of social science rather than psychology.)
36. In his last, unfinished work—a projected portion of a text on social anthro¬
pology—he used the term “institutions” in much the way he had formerly used the
concept of “social usages.” This fragment is published as Part II of Method in Social
Anthropology, and his use of the term “institution” is to be found in his discussion
of “Social Structure” in that place.
37. I here exclude from consideration the question of whether one can speak of
animal societies as being “societies” in the same sense as human societies. In doing
so, I am following Radcliffe-Brown who distinguishes between them on the ground
that animal societies are based exclusively (or almost exclusively) on instinct rather
than on instinct plus culture (cf. A Natural Science of Society, p. 91).
38. For example, in A Natural Science of Society, p. 49.
39. Radcliffe-Brown uses the term “unit,” rather than “part” to refer to the
relata within a system, as contrasted with the members of a class (cf. A Natural
Science of Society, p. 22).
40. A Natural Science of Society, p. 26. And in a posthumous fragment published
in Method in Social Anthropology, he suggested that “the matter” of human societies
consisted in human beings, whereas their “forms” consisted in the ways in which
these individuals were connected by institutional relationships (cf. p. 176).
41. Cf. A Natural Science of Society, p. 53.
42. A Natural Science of Society, pp. 24-26.
43. A Natural Science of Society, p. 56.
44. A Natural Science of Society, p. 56.
45. Cf. A Natural Science of Society, pp. 124—128; cf. also Structure and Function
in Primitive Society, p. 43.
While Malinowski would not have denied that an overall functional consistency
normally characterizes the culture of any society, in Crime and Custom in Savage
Society, and elsewhere, he emphasized the existence of deviant behavior among in¬
dividuals. To what extent Radcliffe-Brown’s theory of the mechanisms underlying
the formation of social usages permits him to account for deviant behavior is, I
believe, an important open question.
46. At this point it may be useful to call attention to the fact that Radcliffe-
Brown did not share what may be termed the “organismic” conception of a culture
that one finds in Ruth Benedict, among others. Like Malinowski, he saw a social
system as composed of elements, and it was the interrelatedness of these elements
that gave the system the degree of unity which it possessed. In various of Ruth
Benedict’s utterances, however, she spoke as if she intended to hold that it was
the system as a whole which determined the characteristics of the units which com¬
prised it. Though Radcliffe-Brown also used the catch phrase “the whole is greater
than the sum of its parts,” distinguishing dynamic systems from mere aggregates,
he never suggested that the system as a whole was to be regarded as self-determining.
In fact, it was the essence of his functionalism that the persistence of the whole was
only made possible by the ways in which each element contributed to it. In this
respect, then, his position was closer to that of Malinowski than it was to that of
Benedict.
47. The Andaman Islanders, p. 229.
48. There are, however, passages in The Andaman Islanders, and elsewhere,
which might be regarded as setting forth such a view. For example, in that work he
puts forward the following statement:
By its action upon the individual the ceremonial develops and maintains
in existence in his mind an organized system of dispositions by which the
social life, in the particular form it takes in the Andamans, is made pos¬
sible, using for the purpose of maintaining the social cohesion all the
instinctive tendencies of human nature, modifying and combining them
according to its needs (p. 327).
Although one can clearly see in such a passage that Radcliffe-Brown wished to
emphasize the needs of a society conceived as a whole, if my interpretation of his
thought has been correct, one need not interpret such passages as being meant to
convey the view that a society has the power to select which social usages shall and
shall not have a place within it. It is the combination and mutual reinforcement of
these social usages, through their effects on individual behavior, which support the
system as a whole.
49. He does in fact usually state his view in such a way as to imply that every
element contributes directly to the system as a whole. Cf. the passage cited above
from page 229 of The Andaman Islanders. In his article “On the Concept of Func¬
tion in Social Science,” he is, however, somewhat more cautious: He only takes this
to be a “working hypothesis” (cf. American Anthropologist, vol. 37 [1935], p. 399).
50. To be sure, Radcliffe-Brown did recognize the pervasiveness of social change,
and did regard it as an important problem for the social anthropologist (cf. A
Natural Science of Society, pp. 86-89). However, his emphasis was upon synchronic
analyses, and he regarded it as legitimate to treat these independently of questions
regarding processes of change.
51. Radcliffe-Brown did not discuss biological explanation in sufficient detail to
make me certain that the charge might not apply to him in that sphere—though from
the general tenor of his discussions of scientific explanation in A Natural Science of
Society, I should be surprised if such a charge were justified with respect to his
biological views. However, I am here solely concerned with the charge as it is
presumed to apply to his social anthropology.
52. A Natural Science of Society, p. 106.
53. Page ix. It is of interest to note that in The Andaman Islanders, and in his
other earlier works, Radcliffe-Brown used “culture” and “society” as roughly equiva¬
lent terms. It was only later that he held it a mistake to view social anthropology
as dealing with “culture.” In his essay “On Social Structure” (1940) he attempted
to argue that a society is “observable” in some sense in which “culture” is not (cf.
Structure and Function in Primitive Society, p. 189 f.). It is difficult to see how this
thesis can be supported.
As I have suggested above (cf. note 25), this change in terminology, and Rad-
cliffe-Brown’s emphasis on it, may have been due, in part at least, to his desire to
distinguish his doctrine from that of Malinowski. Whether the terminological issue
was of importance for anthropology, or whether it had a merely personal basis, seems
to me an open question.
54. Cf. “Culture as a Determinant of Behavior,” in Factors Determining Human
Behavior, Harvard Tercentenary Publications (1937), pp. 137-139 and p. 146.
55. Cf. “Meaning and Scope of Social Anthropology” (1944), in Method in Social
Anthropology, pp. 103-104; also, A Natural Science of Society, p. 49.
56. In his later works, Radcliffe-Brown evinced an increasing interest in problems
of social change, although it cannot be said that he ever completely overcame his
anti-historical bias. For his later concern with problems of change, cf. A Natural
Science of Society, pp. 86—89, and Method in Social Anthropology, pp. 178—189.
To some extent, although to a lesser degree, the same may be said of Malinowski.
However, Malinowski’s concern with the problem of change seems to have been
motivated less by theoretical considerations than by the pressing practical concerns
connected with problems of cultural contact (cf. the posthumous work, The
Dynamics of Culture Change, edited by Phyllis M. Kaberry).
57. The importance which Radcliffe-Brown attached to the distinction between
ideographic and nomothetic enquiries can be seen in the opening pages of the intro¬
duction which he wrote for the papers collected in Structure and Function in
Primitive Society (1952). The same sharp distinction was present in his frequent
attempts to differentiate between “ethnology” and “social anthropology.” (The latter
term he equated with “comparative sociology,” and in effect with “social science.”)
For an early example of this distinction, cf. “The Methods of Ethnology and Social
Anthropology” (1923), reprinted in Method in Social Anthropology (cf. especially,
p. 7). As that essay makes clear, one important reason for his having drawn this
distinction was his desire to free contemporary anthropology from its earlier concern
with questions about the historical origins of specific elements in a culture.
Radcliffe-Brown’s actual use of the contrast between “ideographic” and “nomo¬
thetic” seems to me fundamentally mistaken. The fact that one can distinguish
between ideographic and nomothetic statements does not justify the assumption that
these terms refer to two distinct and independent types of inquiry. Of course, con¬
fusion with respect to this point has not been confined to the methodological views
of Radcliffe-Brown.
58. I should not expect this statement to be challenged by those who know
Malinowski’s Scientific Theory of Culture. On the other hand, it may be thought
to constitute something of a caricature of the position of Radcliffe-Brown. I do not
believe that this is actually the case. Consider, for example, the following passage
from “Patrilineal and Matrilineal Succession”: “Any social system, to survive, must
conform to certain conditions. If we can define adequately one of these universal
conditions, i.e., one to which all human societies must conform, we have a socio¬
logical law. Thereupon if it can be shown that a particular institution in a particular
society is the means by which that society conforms to the law, i.e., to the necessary
condition, we may speak of this as the ‘sociological origin’ of the institution”
(Structure and Function in Primitive Society, p. 43). Such a sociological origin,
rather than any historical origin, is what he views it as the aim of the comparative
method to establish. A similar passage—though perhaps less unambiguous—is to
be found in Method in Social Anthropology, pp. 40-41. Cf. also Structure and Func¬
tion in Primitive Society, p. 86 f.
59. It might also be objected that the foregoing principle cannot be taken to be a
law at all, since from it one could not deduce the specific nature or the degrees of
the changes which would be entailed. However, even though this is the case, I am
not certain that these limitations should lead one to say that a general principle of
this form would not, if true, be an empirical law.
60. In his first analysis of functionalism, Nagel suggested that this might be the
only empirically warranted interpretation of the functionalist thesis. (Cf. pp. 47-48
of his “Problems of Concept and Theory Formation,” as cited in note 1, above.)
This is also the position adopted by Dorothy Emmet (cf. Function, Purpose and
Powers, pp. 81-82).
61. Cf. Laws and Explanation in History (London, 1957), pp. 66 ff.
62. In reviews of two of Malinowski’s posthumously published works, Max
Gluckman has bitterly attacked Malinowski’s functionalism as remaining on a primarily
descriptive level. (Cf. “An Analysis of the Sociological Theories of Bronislaw
Malinowski,” The Rhodes-Livingstone Papers, Number Sixteen; reprinted from African
Studies, March 1947, and from Africa, April 1947.)
“Viewed sympathetically, Malinowski’s position regarding the explanatory func¬
tion of descriptive analyses seems to be unobjectionable. Viewed critically, however,
it appears to be circular, for if one element, a, is to be explained in terms of other
elements b and c, how is one then to explain the latter? According to Malinowski’s
early functionalism, are not the natures of b and c as much dependent upon a, as its
nature is dependent upon them?”
An answer to such a charge of circularity lies in the fact that in advancing any
explanation one must take some facts as given: the field worker finds that b and c
are the case, but is puzzled by a. Thus a is explained through noting how it fits with
b and with c; or, if a were taken as given, b and c might be explained through
their relations to it. In this connection, it is especially to be noted that Malinowski’s
early functionalism was not in any way concerned with questions relating to the
origins of specific cultural characteristics, but only with questions concerning in¬
fluences upon their present natures. Viewed in this light, I do not find it misleading
to regard descriptive analyses of the type which he offered as explanatory.
However, in holding this view, I do not wish to suggest that such analyses can
proceed without at least covertly using generalizations concerning psychological and
institutional factors. (Cf. my criticism of Dray in “Historical Explanation: The Prob¬
lem of ‘Covering Laws’,” History and Theory, vol. 1 (1961), pp. 229-242.) How¬
ever, there is no reason to suppose that Malinowski would have objected to this
contention.
63. For a lucid and more careful exposition of these matters, cf. Carl G. Hempel,
“The Logic of Functional Analysis,” reprinted in revised form in Aspects of Scientific
Explanation, especially pp. 308-319.
64. Cf. “Societal Laws,” British Journal for the Philosophy of Science, vol. 8
(1957), pp. 211-224. [Now reprinted in William Dray: Philosophical Analysis and
History (New York, 1966).]
It may with some justice be argued that many of the analyses of Radcliffe-Brown
did in fact establish generalizations concerning the specific relations which various
elements in societies bear to one another. I am inclined to share this view. However,
the deducibility of such relationships from his general theory of the needs of society
for cohesiveness and continuity might still be challenged.
65. Cf. the first sense of the term “function” as discussed by Nagel in The Struc¬
ture of Science, pp. 522-23. As Nagel rightly says: “The word [“function”] is widely
used to signify relations of dependence or interdependence between two or more
variable factors. . . . Such “functional” relations of dependence or interdependence
are often established by functionalists in their analyses of social processes. However,
if functional analysis means no more than this, it does not differ in aim or logical
character from analyses undertaken in any other domain with the objective of dis¬
covering uniformities in some subject matter.” If I have not been mistaken in my
interpretations of their thought, both Malinowski and Radcliffe-Brown would be
wholly in accord with this conclusion.
LEGAL THEORY: Law, Justice, Ethics, and
Social Morality1
Wolfgang Friedmann

The present essay is an attempt to appraise the interrelationship between
concepts and disciplines which are obviously closely connected. The
analysis of their interrelationship has been the subject of countless discus¬
sions, by philosophers, jurists, theologians, and others. It will nevertheless
be apparent from the subsequent discussion that recent contributions to
this age-old problem, from jurists and philosophers alike, as well as some
important recent judicial decisions, have cast new light on the nature and
function of the legal order.
In the following discussions the concept of law, as a characteristic type
of social norm, will be distinguished from that of a legal system. The
concept of justice will be surveyed in its dual relationship to law and ethics.
A distinction will be made between ethics, as a system of values governing
individual conduct, and social morality, as a system of norms which governs
the social conduct of a community.

LAW

All definitions or characterizations of law veer between two extreme
positions: One extreme emphasizes its coercive character; the other lays
stress on the social acceptance, the actual observance of law by the com¬
munity to which it is addressed. The coercive aspects of the legal norm
rest both on the source of authority (sovereign command, hierarchical
order) and on the enforceability by sanctions—which may be civil,
criminal, or administrative. Both elements are pivotal to the theories of
Austin and Kelsen. . . . Whether the “ought” aspects of the legal norm
are conceived as an actual command flowing from a sovereign (Austin) or
as a hierarchical structure of norms attributed to an ultimate “depersonal¬
ized” sovereign (Kelsen), it is the authoritative aspects of the legal norm
that are singled out as its essential characteristics. By contrast, the concept
of law as reflected by the theories of Savigny and Ehrlich emphasizes the
actual observance, the growth of customs, the “living law” of groups and
communities, as the decisive element. Such law may receive authoritative
confirmation from the sovereign, but it is not created by him. However, the
differences between these two approaches are relative rather than absolute;
they are essentially matters of emphasis. The “positivist” definition cannot
dispense with the acceptance of the legal norm by the community, as shown
by the inclusion, in Austin’s definition, of “habitual obedience,” and, in
Kelsen’s analysis, of the “minimum effectiveness.” On the other hand, at
least in the context of a modern legal system, no social behavior, however
steady and supported by conviction on the part of the observing group, can
dispense with the recognition bestowed upon it by the legislator, the ad¬
ministrator, or the judiciary, or a combination of all these organs of the
modern state.
A third essential element in the concept of law is a degree of generality.
“The first desideratum of a system for subjecting human conduct to the
governance of rules is an obvious one: There must be rules. This may be
stated as a requirement of generality.”1a Here, as in so many other fields,
John Austin’s distinction was basically right, but too rigidly drawn: “Where
it obliges generally to acts or forbearances of a class, a command is a law
or rule, but where it obliges to a specific act or forbearance, or to acts or
forbearances which determine specifically or individually, a command is
occasional or particular.”2 In the legal systems of modern societies, legal
norms, resulting from a constant interplay of legislative, executive, and
judicial pronouncements, cover the whole spectrum from extreme generality
to great particularity. They range from constitutional bills of rights to orders
prohibiting the sale of a particular commodity in a specific district on cer¬
tain days. The relativity and even elusiveness of the distinction between
general and particular command is brought out clearly in the doctrine of the
“Stufentheorie.” ... On the other hand, it is equally obvious that a com¬
munity which had no general prescription at all, but only an infinite
multitude of individual commands, would not be regarded as having a legal
order. It would dissolve into millions of individual relationships. Taking
these three essential elements into consideration, we may say, without an
attempt at “definition” of the term, that the concept of law means a norm
of conduct set for a given community—and accepted by it as binding—by
an authority equipped with the power to lay down norms of a degree of
general application and to enforce them by a variety of sanctions.3

LEGAL SYSTEM

Modern jurists have paid increasing attention to the general concept
and minimum requirements of a “legal system” as distinct from the in¬
dividual legal norm. A legal system “constitutes an individual system deter¬
mined by ‘an inner coherence of meaning,’ ... an integrated body of
rules. . . .”4 A multitude of individual legal norms may not amount to a
legal system unless they are linked with each other in an integrated structure.
An analysis of the minimum requirements of the legal system, which
greatly preoccupies contemporary jurists, such as Kelsen, Ross, Hart, and
Fuller, has therefore a very different perspective from the attempt to define
or characterize a legal norm in isolation. The awareness of a legal system as
a structure in which the different organs, participants, and substantive
prescriptions of the legal order react upon each other is essentially the
corollary to the increasing complexity of modern society, in which millions
of individuals depend on the functioning of a complicated network of legal
rules of many different types and the interplay of public authorities of many
different levels. In primitive societies the reach of law is generally weaker,
and the institutionalization of the legal structure much less developed than
in a more advanced and complex society.5 Not as a matter of conceptual
definition, but as a basis for meaningful inquiry into the nature of a legal
system, we may accept the suggestion of Graham Hughes6 that “for many
purposes it will be useful to reserve the description ‘legal system’ for those
types of social order characterized by a high degree of institutionalization
in the creation of general prescriptions, in the apparatus for adjudicating
disputes, and in ordering the disposition of force.” A similar conception
underlies H. L. A. Hart’s rationalization of the differences between primitive
and advanced legal systems through the distinction between “primary”
rules of obligation and “secondary” rules of recognition.7 Hart’s “primary
rules of obligation” correspond very closely to what is usually described as
“customary” rules. Such rules, which generally are concerned with restric¬
tions on the free use of violence, and other elementary forms of coexistence,
are adequate for primitive communities but inadequate for a more developed
society because they are uncertain, static, and inefficient. Hart’s “secondary
rules of recognition” are in effect a shorthand description for the major
aspects of a modern institutionalized legal system, which develops ma¬
chinery for the formulation of legal rules, for orderly change, and for ad¬
judication.8
It is a concept of stratified legal order which forms the model of Kelsen’s
legal analysis, and in particular of the “Stufentheorie.” This analysis reflects
a mature legal system in which there is a definite relationship between
constitution-maker, legislator, administrator, judge, and private legal sub¬
jects. But the analysis of a modern legal system will also include that of
other institutions such as the family, the church, or the trade union, which
enjoy lawmaking and sanctioning powers in varying degrees (e.g., through
the very powerful sanction of expulsion of union members by union
procedures which, until recently, have been almost entirely immune from
control by the courts, i.e., the state).9
The range of a legal system in a modern state—which he prefers to call
“legal order”—is aptly summarized by Julius Stone10 in the following
four requisites:
First, a legal order arises in the general range of modern states, unitary
or federal and regardless of its particular ideology; second, a legal order
must somehow be distinguishable from a moral and social order; third,
the concept of law is a class concept, i.e., it must apply to the members of
a given class; and fourth, a legal order is an “experienced single entity,”
something distinct from the “individual norms which are a part of it.”11
All the above-mentioned descriptions and analyses of a legal order or
legal system have abstained from linking the essence or existence or basic
structure of a legal system with certain minimum requirements of justice or
morality. An exception is the recent attempt of Professor Fuller to deduce
eight requirements of “inner morality” of law from the very nature of a legal
system.12 These eight principles are not conceived as maxims of substantive
natural law, i.e., as a summary of the ideals inspiring a particular society
as worthy of attainment. They are instead seen as a kind of “procedural
natural law.” The eight principles are: (1) generality; (2) promulgation;
(3) prospective legal operation, i.e., generally prohibition of retroactive
laws; (4) intelligibility and clarity; (5) avoidance of contradictions; (6)
avoidance of impossible demands;13 (7) constancy of the law through time,
i.e., avoidance of frequent changes; (8) congruence between official action
and declared rule.
It is difficult to see in what respect these eight requirements do more
than spell out the minimum components of an efficiently functioning
modern legal system. As such they are applicable to any legal system, re¬
gardless of ideology, whether totalitarian or democratic. Even the one re¬
quirement that might be thought of as expressing a particular—liberal—
philosophy, i.e., the general—though not absolute—prohibition of retro¬
activity, is essential to the functioning of any legal system. No totalitarian
legal order could survive for any length of time if all or a great majority of
its laws were made retroactive; legal order would break down in confusion.
On the other hand, as Fuller himself says, democratic legal systems may
sometimes have to admit retrospective legislation. In this context, Fuller
mentions the difference between, e.g., ex post facto criminal statutes (un¬
justifiable) and retroactive tax laws that impose taxes on earnings received
before the date of their enactment (justifiable).14 The Nazi system, at the
height of its effectiveness, complied with all the eight requirements, except,
to some extent, that of promulgation.
The Nazi extermination decrees, for example—whose “orderly” and
systematic brutality has more than anything else inspired the many postwar
discussions on the invalidity of the Nazi legal system as incompatible with
basic principles of humanity—were certainly general, insofar as they ap¬
plied to a tragically large but clearly definable class of victims; they were
made known to them with cynical brutality; they were prospective, clear,
and free from contradictions; and they were, unfortunately, far from being
impossible of execution. Nor was there any lack of congruence between
the law and official action, because the exterminations took place within a
hierarchic structure derived from the supreme legislative authority of the
Fuehrer. It is only by means of a petitio principii, namely that barbarous
“laws” should not be qualified as such, that we can circumvent this con¬
clusion. In the earlier debate which clearly inspired Fuller’s book,15 he
maintains “that a dictatorship which clothes itself with a tinsel of legal
form can so far depart from the morality of order, from the inner morality
of law itself, that it ceases to be a legal system.” The illustration given by
Fuller, and also earlier by Gustav Radbruch, in his moving attempt to find
some criterion of illegality with regard to the greatest enormities of the
Nazi system, referred to the chaotic features of the final Nazi period, a
period when the shouts of Hitler, pronounced in epileptic fits, could be
taken as overruling previously published legal orders. Such a state of affairs
is “anarchy” in the strictest sense. It will collapse simply because it can no
longer maintain minimum legal order in a complex society. Use of the term
“morality” in connection with this galaxy of requirements is misleading.
They are “essentially principles of good craftsmanship.”16 Nor do they
offend against the characterization of law as a “purposeful enterprise de¬
pendent for its success on the energy, insight, intelligence, and conscien¬
tiousness of those who conduct it,” and which “displays structural con¬
stancies.”17 The legal system of the Nazi period, dominated by the ideology
of racial discrimination, was extremely successful, in terms of its stated
purpose: the degradation and extermination of “inferior” races, especially
of the Jews. No order in human memory has achieved its purpose more
swiftly and efficiently; none achieved a greater result in so little time, with
comparable economy (including the utilization of the remains of the victims
for the production of fertilizers). The use of the term “morality,” however
qualified, instead of the term “structure,” which characterizes the anatomy
of a legal system, regardless of ideology, is therefore unfortunate. Under¬
lying Fuller’s thesis that the institutionalized structure of a legal system as
a purposeful enterprise implies a degree of morality is his conviction that
humanity is progressing in moral insight through growing “participation in
institutional procedures” through “human beings confronting one another
in some social context, adjusting their relations reciprocally, negotiating,
voting, arguing before some arbiter. . . .”18 But this is a picture of pluralistic
society, in which groups and individuals can argue and bargain freely, not
of a society in which there is no such give and take, at least in the major
conditions of political and social life. There is also ambiguity in Fuller’s use
of the term “conscientiousness”19 with regard to the administration of
legal institutions and precepts. An Eichmann applied the norm set for him,
i.e., the efficient disposal of the greatest possible number of Jews, with
supreme conscientiousness. Hence his failure to understand why he should
be prosecuted.20 The conscience of a Nazi differs from that of a Christian
or a pacifist. He is directed by the basic values of the order he seeks to obey
and administer.
The structural analysis of a “legal system” is indeed more complex than
that of a primitive legal order. It denotes any cohesive order of norms that
purports to govern a community through the use of “posited” authority,
whether oppressive or liberal, socialist or capitalistic in character. The
structural requirements of a legal system must be accepted by the analytical
positivist as much as by the advocate of a natural law philosophy. It is only
by a consideration of the related but distinct concepts of “Justice,” “Ethics,”
and “Morality” that we can elucidate the relation of the legal order to
the values of life.

THE RELATION OF JUSTICE TO LAW AND ETHICS

Every legal system is oriented toward certain purposes which it seeks to
implement. In this sense, every legal system is of necessity a “purposeful
enterprise.” But in this universal sense the concept of justice is also of
necessity devoid of ideological content. The “justice” of a given legal system
may be a laissez-faire economy or the public ownership of all productive
enterprise; it may be a parliamentary multiparty system or a one-party state;
the system may be built upon the ideology of separation of powers or on the
subordination of administration and the judiciary to the will of the legislator.
It may aspire to the equality of all citizens or to a hierarchical structure of
superior and inferior citizens; it may implement the supremacy of international
over national law or—as is the case with almost all contemporary
legal systems—the inverse.
The classical definition of “distributive justice” is that of Aristotle:
“Injustice arises when equals are treated unequally and also when unequals
are treated equally.”21 Does this mean anything beyond the proposition,
stated earlier, that every legal order is directed toward some ideal of justice?
Certainly it cannot lead to any specific political philosophy of equality. It
is compatible with legal systems that discriminate between free men and
slaves, between blacks and whites, between “Aryans” and Jews, between
nationals and foreigners, between rich and poor, between men and women.
For all of these are “class concepts,” groups which the legal order may
consider as being equal or unequal in relation to each other. Some legal
philosophers have attempted to extract substantive meaning from the very
idea of justice, notably Stammler and Del Vecchio. Both have failed in their
attempts to extract from Kant’s “practical reason” substantive ideals of
justice. . . .
The difficulty of extracting any substantive principle from Aristotle’s
“distributive justice” is increased by the fact that the number and types
of classes which a legal order can establish for purposes of differential
treatment is almost infinite. For this or that purpose property owners may
be distinguished from tenants, local residents from out-of-town residents,
high school graduates from elementary school graduates, and so forth. It is
common sense and the practical needs of administration rather than principles
that tend to limit class distinctions. Yet, there is a procedural
residuum in the notion of “equality for equals” which makes it more than
a meaningless formula. It implies a minimum machinery of justice, some
procedure for the determination of treatment as equal or unequal in a
particular case. This carries the implication of a third party procedure
and thus some minimum concept of “due process.”22 Thus Ginsberg is
right23 in deducing from the concept of “general” justice “the control of
power relations and more particularly the exclusion of arbitrary power.”24
Some concept of impartiality is inherent in the very process of determination
of equality, even in the most hierarchical society. But since there are no
theoretical limits to the ways in which a particular legal order chooses to
determine and subdivide classes of “equals,” “arbitrariness” can become a
very elusive criterion. When Ginsberg deduces that because “differential
treatment requires justification in terms of relevant differences,” arbitrary
discriminations, “such as those based on race, color, religion, sex,” must
disappear and “equality in political rights is extended to equality in social
and economic rights,” he slides from a formal and procedural notion of
equality to a substantive and political one. To one legal order differences of
race, color, religion, sex, or wealth may appear “relevant” and far from
“arbitrary,” whereas to another they may not.25 Nor has history shown a
continuous evolution in the sense indicated by Ginsberg. While the belief in
steady progress toward democracy and equality was popular in the nineteenth
century, the twentieth century has shown, in the most brutal manner,
how powerful the philosophies of racial, national, and religious discriminations
are. Concepts of justice contrary to the democratic idea of equality
govern the majority of the world’s states, while in the relations between
nations equality has barely begun to compete with the ideologies of nationalism,
racialism, and power politics. Another attempt—similar in result,
though not in method, to that of Ginsberg—to demonstrate the possibility
of rational justification of values of justice, is that of Chaim Perelman.
In his Idea of Justice26 Perelman had demonstrated that the only factor
common to various conceptions of justice was “formal justice.” This consists
in equality of treatment for all the members of one and the same essential
category. “The only requirement we can formulate in respect of a
rule is that it should not be arbitrary, but should justify itself, should flow
from a normative system.”27 But, “the only claim one could rightfully make
would consist in eliminating everything arbitrary save what is implied in
affirming the values at the basis of the system.” In other words, no values
and aspirations can be rationally deduced; the ultimate values and aspirations
themselves are nonrational. This approach is essentially similar to
Aristotle’s concept of distributive justice, and to the more modern philosophy
of relativism as developed by Max Weber, Gustav Radbruch, Kelsen, and
others. More recently, Perelman has, however, attempted to find some rational
justification for values and norms. In a work devoted to the theory of
argument28 Perelman took the Topics of Aristotle as a starting point for
the use of dialectical, in contrast to analytical, proof of legal argument.
Juristic argument aims at justification, not demonstration, of truth. The
justification of an action, a kind of behavior, or a decision is not concerned
with truth or falsehood. Justification can deal only with debatable things.
It is concerned with arguments of morality, legality, regularity, usefulness,
or opportuneness.29 It follows that “assertions which represent the
systematic formulation of an ideal cannot be judged the way we judge
factual judgment. Their role is not to conform to experience but to furnish
criteria for evaluating and judging experience and, if necessary, for disqualifying
certain aspects of it.”30 How can legal argumentation be made
“rational”? Perelman’s answer is that, in conformity with Kant’s categorical
imperative, the characteristic of rational argumentation is the “aim for
universality”; its postulates must be “valid for the whole of the human
community.” This does not mean that the criteria and values of rational
argumentation constitute “absolute and impersonal values and truths.”
Rather do they express “the convictions and aspirations of a free but
reasonable man, engaged in a creative, personal and historically situated
effort: that of proposing to the universal audience as he sees it, a number
of acceptable theses.”31
The difficulty with this approach is that we must assume there is a
“universal audience” which shares common values. This is possible on the
basis of the philosophy of the Stoics, which appeals to universal reason, or
of the scholastics who deduce the rightness of human institutions from the
will and reason of God. In other words, it must base itself on a natural law
philosophy. On any other assumption, the universality of the audience dissolves
itself into a number of conflicting values, ideas, and policies.32
The impossibility of deriving specific legal ideals from the “sense of
justice” is expressed by another contemporary jurist33 in the distinction
between “justice” and “justness.” Varied as these judgments on justness will
“always be, they respond to emotion which, insofar as language permits
a verification, . . . flows from a sense of morality” or justice or value.
Ehrenzweig, basing himself on the work of Bienenfeld,34 looks to psychoanalysis
for the answer to the various concepts of justice, which “follow
the youth through adolescence to adulthood when their conflict will determine
and threaten the very coexistence of families, communities, and
nations. Yet they coexist. If in need he leans toward communism; if efficient
to socialism; if attacked to conservatism; and if attacking to liberalism
[and] if in the nationalist dogma of the nursery that is exhibited in the
instinctive presentation of a united front against the outsider.”35
The conflict of values is here transferred from the outside world to the
psyche of the individual.
We conclude that “justice,” as a generally valid concept, is formal, in the
sense that it is the goal to which every legal order aspires as a “purposeful
enterprise,” and procedural, in the sense that the Aristotelian notion of
“equality for equals” implies a minimum machinery of justice and third
party determination.
Beyond this, it is necessary to turn to the field of ethics and morality for
a determination of the values that may give the idea of justice a specific
substantive content.

ETHICS AND SOCIAL MORALITY

The great majority of writers use the terms “ethical” and “moral” interchangeably.36
Although the choice of terms is largely a matter of preference,
it is submitted that the distinction recently suggested by P. F. Strawson
between social morality and ethics is more than a matter of terminology,
because it clarifies the relationship of individual values to those of the
social and, from there, the legal order. In an article entitled “Social Morality
and Individual Ideal,”37 Strawson suggests that “the region of the ethical
. . . is a region of diverse, certainly incompatible and possibly practically
conflicting ideal images or pictures of human life. . . .”38 Ethics is thus the
sphere of ideal forms of life set by individuals for themselves. It is further
implicit in his suggestion that these ideal images of man’s life—generally
called values—conflict, and that “the multiplicity of conflicting pictures is
itself the essential element of one’s picture of man.” By contrast, the sphere
of morality denotes “rules or principles governing human behavior which
apply universally within a community or class.” A “minimal conception of
morality” limits itself to those rules which are “a condition of the existence
of society,” whereas a more comprehensive conception of morality would
embrace the entire body of rules governing a community or class.
The merit of this approach is that it illuminates the tripartite relation
between (a) the values that individuals, as conscientious and responsible
human beings, set to themselves, (b) the moral norms governing a society
—which reflect a social balance and choice between conflicting individual
values, and (c) the legal order, which must reflect the current social
morality but is far from identical with it. In the completely masterminded
and conditioned society depicted in Huxley’s Brave New World or Orwell’s
1984, the distinction between social morality and individual ethics, and
ultimately that between law, morality, and ethics, might altogether disappear.
The norms of social behavior would be set by “Big Brother,” who
exercises complete legislative authority, and whose lawmaking power is
used to direct and control every aspect and corner of social behavior.
The individual in turn is conditioned to accept the sociolegal norms as
controlling his entire life and precluding the formation of individual values
—which in Orwell’s terminology would be “ungood.”
In contemporary societies the relative spheres of law, morality, and ethics,
as defined here, differ, of course, considerably. But in every contemporary
society there is some tension between these three orders of conduct. In the
pluralistic and relatively individualistic society which characterizes the value
system of modern democracy, the tension between these three spheres is
and must be considerable. There is liberty left for the individual to form
and live by one of the many conflicting “pictures of life”; this is limited
by the many constraints of social morality which flow from the necessities
of social life as well as by the ideological restraints imposed by society on
the individuals living within it. Finally, there is an increasingly active
reciprocal interrelationship between the legal and the moral order. On the
one hand, moral values press upon the legal system, and on the other,
the modern lawmaker can to an increasing extent influence and modify the
social habits of the community.
Another advantage of the tripartite classification would appear to be
that it bypasses the ancient and rather age-worn characterization of law
as being concerned with external conduct and morals as concerned with
internal conduct. This is, of course, the classical distinction drawn by Kant
in the Critique of Practical Reason, which has been adopted by many
moral and legal philosophers.39 Clearly such a distinction, even if generally
valid, is greatly dependent upon the reach of law in society, which is vastly
greater in the fairly concentrated and manipulated modern society than it is
in primitive communities. A legal system that makes punishment or civil
obligation dependent upon malicious intention or capacity to control one’s
actions reaches into the inner mind of man, and modern psychology has
refined and enlarged the interrelationship between the inner workings of
the mind and external conduct. Even more barren is the converse proposition
that morality is only concerned with internal conduct. The distinction
between ethical, i.e., individual value judgments, and social conduct, i.e.,
morality, helps to clarify this matter. Instead of the watertight and artificial
division into three distinct spheres, we should think of a fluid interrelationship,
variable with regard to the separation and interpenetration of the three
spheres according to the character of the society in question.

ETHICAL THEORIES AND VALUATIONS

There have been innumerable classifications of ethical theories.40 The
most important and recurrent division is that according to the sources of
knowledge of ethical values, into naturalistic, intuitionistic, and noncognitive.
Very briefly, naturalism (a term coined by G. E. Moore in his Principia
Ethica, 1903) denotes “any view that holds that ethical properties can be
analyzed into or defined in terms of natural ones.”41 Intuitionism holds that
ethics is an autonomous discipline with its own peculiar subject matter. In
contrast to the naturalists, “intuitionists” believe that the basic propositions
of normative ethics are intuitive or self-evident insights of a unique kind,
which cannot be inferred from any other discipline.42
The link between these two types of ethical theory—which can be
almost infinitely subdivided—is that they hold ethical values to be capable
of objective determination. By contrast, the “noncognitive” theories regard
ethical values as incapable of any objective analysis because they are purely
emotive or at least not verifiable.43

ETHICAL AND LEGAL THEORY

As will be apparent from the following observations, all three types of
theory have had considerable influence on legal theory and correspond to
distinct types of jurisprudential thinking. The perspective from which ethical
theories are best considered in the light of legal theory is that of “validity.”
As a convenient point of departure we may take a definition of “validity”
that embraces both formal and substantive theories, by abstracting the
validity of a legal system from the content of its basic norms. According
to Kelsen44 “a legal norm is valid . . . because it is created in a specific
manner, ultimately determined by a presupposed (vorausgesetzt) basic
norm.” In the formulation of Alf Ross, “a system of norms is ‘valid’ if it is
able to serve as a scheme of interpretation for a corresponding set of social
actions, in such a way that it becomes possible for us to comprehend the
set of actions as a coherent whole of meaning and motivation, and within
certain limits to predict them.”45 From this perspective it may be convenient
to divide ethical theories into those that postulate the objective validity of
the ethical postulates and those that deny such objective validity. In the
former category there are:
(a) The type of ethical theory that is based on metapositive values, either
of a religious or a nonreligious order—in legal theory this type of ethical
approach is reflected in the main body of natural law philosophy, whether
of a theological or a rationalistic character.
(b) Those that postulate ethical values of an objective, and therefore compelling,
but instinctively felt character—to this approach there corresponds,
in legal philosophy, the type of theory that bases postulates of justice on a
Rechtsgefühl (Krabbe), or a sentimento giuridico (Del Vecchio), or
“intuitive” law (Petrazhitsky). A more rationalized version of this approach
is Edmond Cahn’s “sense of injustice.” In this category there is also Gény’s
“creative intuition” as a source of legal evolution through juristic action.
The philosophical godfather of most of these theories is the French philosopher,
Henri Bergson, whose celebrated work Evolution Creatrice has been
one of the most influential of the present century.
(c) Empirical theories—A contemporary American philosopher has described
empiricism in the following terms:

It is characteristic of “empiricism,” as a philosophical tradition, to assume that
we have certain criteria of evidence, or that we can identify a certain “source”
of our knowledge, and then to apply these criteria, or to refer to this source,
and thus determine what it is that we can know.46

In this wide sense, “empiricism” is contrasted with all theories that derive
principles of ethical conduct from a priori metaphysical premises. In this
wide sense, “empiricism” corresponds, in legal theory, to “positivism,” and
metaphysical ethical theories are reflected in legal idealism. In ethical
theory, empiricism, to a large extent, coincides with naturalism, since one
of the fundamental theses of the latter is that “the truth or falsehood of
ethical sentences is established by methods of experimentation and observation
characteristic of the natural sciences.”47
From the perspective of relevance to corresponding legal theories, we
may distinguish three major types of ethical empiricism: First is the approach
that derives ethical maxims from historical and social experience.
Second is the approach that tests ethical values in the light of social facts
and realities. This type of empiricism, commonly known as “pragmatism,”
is a specifically American contribution to modern philosophy and ethics.
It is linked with the names of Charles Sanders Peirce and John Dewey.
Third, there is “logical positivism,” i.e., the approach to philosophical statements
which excludes from scientific study anything that is not “verifiable”
by either logical deduction or experimental observation. In ethical theory
this approach leads to “noncognitivism.”. . . Noncognitivists exclude ethical
maxims from scientific enquiry as being essentially “emotive” and not subject
to scientific verification.

LEGAL THEORIES BASED ON OBJECTIVE ETHICAL CRITERIA

In jurisprudence, the first approach is represented by those theories that
regard certain basic principles of conduct as essential to a satisfactory legal
order, not as a matter of a priori postulates set by God or reason (natural
law) but as a matter of social experience. Generally we can group under this
approach all the “social contract” theories, which are predicated on the
assumption that men need to restrain their appetites for violence, greed, and
domination, in order to achieve a minimum of mutual protection and security.
A logical continuation of the “social contract” approach could be
Kant’s categorical imperative and its derivative definition of law as “the aggregate
of the conditions under which the arbitrary will of one individual
may be combined with that of another under a general inclusive law of freedom.”
But in Kant’s philosophical scheme these principles are not empirical;
they are given a priori, as an essential basis of man’s volition as a free and
rational being.
Similar or even identical ethical postulates can thus be derived either
from a priori judgments or from empirical observations. A corollary to this
duality of approaches in ethical theory can be found in legal philosophy.
Thus, throughout the long history of natural law philosophy, many postulates
such as the absolute integrity of private property, the supremacy of
the lawmaking authority of the church over the state, or vice versa, the
equality or inequality of men, nations, or races, and many more, have been
deduced from metaphysical principles of a God-given universe or universal
reason. But a contemporary jurist48 has formulated five principles of what
he describes as “the minimum content of natural law,” not as a priori principles
but as necessary to “the minimum purpose of survival which men have
in associating with each other.”49 They are thus essentially a continuation
and modernization of the “social contract” philosophy. The five principles
are: (1) human vulnerability, which makes it necessary for a legal order to
restrict the use of violence; (2) approximate equality, which makes it necessary
for a legal system to develop rules of mutual forbearance and compromise;
(3) limited altruism, which makes necessary some provisions to
restrain tendencies to aggression;50 (4) limited resources, which makes
necessary some system of exchange or joint planning of services and goods;
and (5) limited understanding and strength of will, from which follows
the need for a system of voluntary cooperation in a coercive system. Thus,
maxims of conduct which many of the older natural law philosophers have
presented as flowing from the immutable nature of man are here presented
as having been shown by experience and history to be necessary to the
survival of man in civilization.

PRAGMATISM IN ETHICS AND LAW

Pragmatism, as a particular type of empirical philosophy, has had a direct
and traceable influence on modern jurisprudence, in the American realist
movement. ... Its intellectual fathers are William James and John Dewey.
The characteristic feature of Dewey’s pragmatism, as applied in the realist
movement, is the method of enquiry. An enquiry into an ethical proposition
may start with the formulation of a value hypothesis;51 but this value postulate
is only provisional and has to be tested by the means of its possible realization.
A study of such means—which include the legal, social, and economic environment
of a society—may influence and modify the value postulate. A
convenient illustration of this approach might be the question of prohibition
of alcohol,52 which deeply influenced American legal, economic, and social
life for more than a decade after the First World War. Absolute prohibition
could be stated as a value goal. Means of its execution consist in the appropriate
constitutional amendments, statutory prohibitions, administrative
regulations, and the policing of the legal prohibitions. An enquiry into the
means of execution may show that the purported enforcement of prohibition
leads—as in fact it did—to a vast increase in the consumption of illegal and
often lethal alcohol, bootlegging, gang warfare, murder, and a general increase
in criminality. The results of such enquiries may lead to an abandonment
or the modification of the original value postulate. Abandonment of
the ethical postulate, in the light of practical experience, is expressed in the
repeal of the constitutional amendment in the U.S. Constitution. An alternative
solution is that of the institution of state-controlled liquor boards, which
prevails in Canada.

ETHICAL AND LEGAL THEORIES DENYING OBJECTIVE VALIDITY

Whereas all the previously mentioned types of theory assert that ethical
values can be objectively ascertained—whether they be deduced from natural
law foundations, from a priori principles predicated on the rationality
of man, from empirical data based on history, from generally accepted principles
of good and right, or from pragmatic enquiry, the “nonobjective”
theories of ethics deny that ethical values can be objectively ascertained. To
them, ethical values are a matter of conviction. They must be believed in
but cannot be proved.

RELATIVISM IN ETHICS AND LEGAL PHILOSOPHY

There are, however, two major types of this kind of approach to ethical
norms. One trend of thought is best described, in ethics as in jurisprudence,
as “relativism”; the other is “noncognitivism.” While both agree on the
nonprovability of values, the relativists believe that rational argument can
and must support the choice of a particular value, by comparison with, and
often in opposition to, another value. The noncognitivists reject all study of
ethical values as purely emotive and therefore not within the realm of
science. Probably the most influential of the relativists in modern times is
Max Weber. . . . His most important disciple in the field of legal theory is
Gustav Radbruch, whose legal philosophy—which in recent years has become
the subject of attention far beyond the borders of Germany and
Europe—is a profound and important application of the relativistic approach.
For it is not content to state the antinomies of legal ideas but follows
the major antithetic values of legal philosophy into particular legal institutions
and concepts.
Among contemporary ethical philosophers, we may list as relativists
Dewey, Russell, and Ginsberg. Dewey’s entire work53 is permeated by the
thought that value statements are prescriptions or recommendations for
action based on alternative convictions but that “it is morally necessary to
state grounds or reasons for the course advised and recommended. These
consist of matter-of-fact sentences reporting what has been and now is, as
conditions, and of estimates of consequences that would ensue if certain of
them are used as means.”54
Bertrand Russell has, in his many writings, veered from the conviction
that ethical statements are purely emotive to one that holds that truth can
be discovered by the use of reason. The former view is expressed, for example,
in his Religion and Science,55 where he says that: “Since no way can
be even imagined for deciding a difference as to value, the conclusion is
forced upon us that the difference is one of tastes, not one as to any objective
truths.” But in his more recent work on Human Society in Ethics
and Politics of 1954, Russell stresses the role of reason.

Reason has a perfectly clear and precise meaning. It signifies a choice of
the right means to an end that you wish to achieve. It has nothing whatever to
do with a choice of ends. But opponents of reason do not realize this, and
think that advocates of rationality want reason to dictate ends as well as
means.56

The essence of his reasoning57 is that a concept such as “good” has an intrinsic
value of its own. Intrinsic value is “the property of being a state of
mind desired by the person who experiences it.” “Good” is the property of
arousing an emotion of approval, “bad” that which arouses disapproval.
Thus “good” is linked with pleasure or desire on the part of the actor and
approval on the part of the community. “Right” conduct is that which is
likely to produce “good” effects. This position is very close to that of Henry
Sidgwick, who in his Methods of Ethics argued that generally recognized
moral rules can be deduced from the principle that we ought to aim at maximizing
pleasure. Russell’s later theory thus comes close to that of the
“naturalists.”
Like the former, and like that of the utilitarian legal philosophers, . . .
this approach is full of ambiguities. Even if we accept that the “good” is the
end desired by an individual, the test of “approval” is highly ambivalent.
Does it mean approval by the entire community, by a select avant-garde,
approval by the greatest number or by the wise? Is not the highest ethical
conduct sometimes that which is in revolt against the majority and therefore
arouses intense disapproval? Ethical pioneers, like the early Christians,
the early fighters for women’s rights or labor organizations, or pacifists,
have had to pursue their values in intense opposition to the overwhelming
majority of their fellow countrymen. A similar criticism can be raised against
the essay by S. E. Toulmin,58 which accords with Sidgwick and Russell in

the assertion that “to say that X is right is to say that X is worthy of approval.”59
For Toulmin, to be “worthy of approval” means that “there is a
valid reason for approving X.” It is not the fact of acceptance but the
worthiness of acceptance which gives an argument validity. But here the
same question arises, namely, what are the criteria for worthiness of approval?
One answer given by Toulmin60 is that there is a valid reason for
doing something when it can be shown to be in accordance with rules or
maxims accepted in our society. Where there is conflict between rules or
norms within a society, it is reasonable to appeal to an “overall” principle,
that “preventable suffering shall be averted.” “The notions of ‘obligation,’
‘right,’ ‘justice,’ ‘duty,’ and ‘ethics’ apply in the first place when our actions
and institutions may lead to avoidable misery for others; but it is a natural
and familiar extension to use them also when the issue concerns a chance of
deeper happiness for others and even for ourselves.”61 This appears to
gloss over the deep conflicts which may arise between conflicting values,
and in particular between the ethics of an individual and the prevalent
morality of a society. Here we can see again the value of the distinction
between individual ethics and social moralities which was adopted at the
outset of this section. There simply is no necessary equivalence between the
avoidance of misery, happiness for others, and happiness for oneself. These
emotions or purposes may coincide, but they may also starkly conflict with
each other.
The value of the aforementioned attempts does not lie in their somewhat
simple version of utilitarianism, but rather in the emphasis on the use of
reason and rational argument in the clarification of ends. This appears to
be the position of Morris Ginsberg,62 who describes the task of ethics as
being: “(a) to bring out what is implied in the notion of a norm or principle
of action, and (b) to survey the major or dominant goods or values and
the norms or injunctions they entail.”63 Ginsberg accepts that there is a
plurality of values and of conflicting ways of life. He does, however, assert
that “there is ... in every society a general framework, the maintenance or
furtherance of which comes to be conceived as an overriding obligation,
though this may come into conflict with the demands of particular ideals.”64
It is in an attempt to elaborate the principal legal values of a contemporary
Western-type society rather than absolute values that Ginsberg, in subsequent
chapters, elaborates such matters as economic and political rights, the
modern status of association and contract, the ethics of punishment, and
freedom of thought. How important rational arguments can be, not in the
determination of ends, but in the clarification and concretization of given
values, as applied in the legal life of a community, may be illustrated by
the problem of freedom of contract. A general catalog of basic rights and
values in a democratic society is likely to enumerate both “freedom of contract”
and “equality” as values to be protected by the law. As long as these
values remain abstract and general, there appears to be no contradiction.
But when we follow the implications of freedom and equality of contract,
in the context of modern industrial society, such conflicts and tensions become
readily apparent and may compel the subordination of one end to
another. In the earlier stages of industrial capitalism, freedom of contract
led to an increasing inequality between the entrepreneur and the unorganized
worker, due to stark differences in their economic power. This led to a
countermove to the organization of trade unions, which increasingly, through
collective organization, compensated for the weakness of the individual
worker. In contemporary industrial society, unions tend to face employers
as equals but at the expense of the freedom of bargaining of the individual
worker, who has surrendered it for the sake of equality, expressed in better
terms and the improvement of his standard of living. Individual freedom of
contract here gives way to the more important goal of economic equality.
The two values cannot be implemented simultaneously.65
Thus it is only by following basic ethical values into their implementation
in a given social context that their true meaning and ranking can be ascertained.
This is indeed a matter of reasoning, supported by factual data.66 It
is only with this qualification that we may accept the attempts of such
writers as Russell, Dewey, Perelman, Ginsberg, and Toulmin to emphasize
the place of rational argument in the ascertainment of values.

NONCOGNITIVIST ETHICAL THEORIES

David Hume is generally regarded as the father of ethical noncognitivism.
His celebrated statement that reason “is and ought only to be the slave of
the passions and can never pretend to any other office than to serve and obey
them” not only undermined the foundations of natural law but also implied
that reason is essentially the servant of emotions, which latter set the goals
of action. But Hume may also be cited in support of the relativist position
since, in his Enquiry Concerning the Principles of Morals,67 he says:

The hypothesis which we embrace is plain. It maintains that morality is
determined by sentiment. It defines virtue to be whatever mental action or
quality gives to a spectator a pleasing sentiment of approbation; and vice the
contrary. We then proceed to examine a plain matter of fact, to wit, what
actions have this influence; we consider all the circumstances, in which these
actions agree; and thence endeavour to extract some general observations with
regard to these sentiments.

Be that as it may, modern ethical noncognitivists have asserted emphatically
that normative concepts are purely emotive. For the most radical of
the noncognitivists, A. J. Ayer,68 all normative words are “pseudo-concepts.”
All genuine concepts must be either empirical or logical. The empirically
verifiable and the logically certifiable exhaust the cognitive
dimensions of meaning. Everything else, such as a simple command, a blush,
a yawn, but also words like “good,” “bad,” “ought,” “worthy,” are purely
emotive, and there cannot be such a thing as ethical or moral science.
An important modification of this position is Charles L. Stevenson’s influential
Ethics and Language (1944). Whereas Ayer argues that “what we
do not and cannot argue about is the validity of these moral principles. We
merely praise or condemn them in the light of our own feelings,”69 Stevenson
distinguishes between attitudes and beliefs. Only disagreements in
attitude—which comprise the basic values—are genuine and irreducible.
You cannot argue about purposes or preferences. But a value judgment
such as that a certain person is “good” has a complex meaning which is
partly emotive and partly cognitive. A full picture of ethics must recognize
both factors. To illustrate his position, Stevenson gives as an example the
choice before trustees for the estate of a philanthropist; the trustees have
been instructed to forward any charitable cause that seems to them worthy.
They argue as to whether to provide hospital facilities for the poor or to
endow universities. The choice between these alternatives is an irreducible
choice between different “attitudes.” But “the discussion is almost certain to
involve disagreement in belief. Perhaps the men will disagree . . . about
the present state of the poor, and the extent to which hospital facilities are
already provided for them. Perhaps they will disagree about the financial
state of the universities, or the effects of education on private and social
life.”70 On the latter type of question, agreement can be reached through the
investigation of facts, which may confirm the one or the other position.
In essence, this position is very close to that of Dewey’s pragmatic “logic
of enquiry.” The need to test legal values in the light of reality, by factual
evidence, would be regarded by contemporary lawyers as almost too trivial
to require demonstration. In part this is due to the efforts of the American
legal realists, who themselves are strongly influenced by Dewey. But it is
today—and has been for some time—part and parcel of the administration
of justice in modern society. An outstanding and familiar example is the
so-called “Brandeis brief.” . . . This brief consisted of a short “value statement,”
i.e., the proposition that an Oregon statute fixing a ten-hour maximum
day for women was in accordance with the constitutional values
embodied in the Fourteenth Amendment, and a very elaborate factual brief
as to the conditions actually prevailing in the relevant industries and their
relevance to the state of women’s health, safety, and morals.
Among contemporary legal theorists, Alf Ross is clearly a “noncognitivist.”
For Ross71 such terms as “just” or “unjust” are entirely devoid of
meaning. They are merely expressions of like or dislike. “To invoke justice
is the same thing as banging on the table, an emotional expression which
turns one’s demand into an absolute postulate. ... It is impossible to have
a rational discussion with a man who mobilizes ‘justice,’ because he says
nothing that can be argued for or against.” It is clear that this position differs
sharply from that of relativists like Radbruch or Ginsberg, for whom there
is very much to argue about in respect to conflicting values, even though
they accept that the ultimate ends cannot be proved but must be believed
in. Although Kelsen could also be described as a “noncognitivist” for the
purposes of legal science, insofar as he denies that the prescription of values
of any kind, including ethical ideals, can be the proper subject of legal
science, he regards the cognition (Erkenntnis) and description (Beschreibung)
of the law as a system of norms constituting legal values as the proper
realm of legal science.72 But such values are, for Kelsen as for Radbruch,
relative not absolute.
The influential contemporary school of Oxford “ordinary language philosophers”
is sometimes linked with the ethical noncognitivists.73 But it
would appear that the Oxford philosophers’ emphasis on the analysis of the
meaning of language has no particular ethical connotation, positive or
negative. It seeks to elucidate the meaning of legal terms and concepts in
the context of legal language.74 An illustration is H. L. A. Hart’s “Definition
and Theory in Jurisprudence,”75 where he investigates the meaning of
such concepts as a legal right or corporate personality in the legal context
in which they are used. By contrast, Hart’s principles of “minimum natural
law” are, as shown earlier, not connected with the “ordinary language philosophy”
but state an ethical philosophy of an essentially empiric character.

ETHICAL THEORIES AND THE SOLUTION OF LEGAL PROBLEMS

It may be useful to test both the relevance of different ethical theories
and the differences between them by applying them to the solution of a
legal problem with deep ethical implications. No contemporary problem
has shaken the consciences of lawyers more deeply, and revealed more
pungently the tension between conscience and legal order, than the problem
of disobedience to Nazi laws—a problem that could be transposed to other,
comparable situations of conflict between an inhuman legal order and the
ethical conscience of the individual. This problem has been at the heart of
the revival of natural law thinking after the Second World War; it has
inspired the postwar thinking of Gustav Radbruch; and it has been the main
subject of a now famous debate between Professors Hart and Fuller.
How would the different ethical theories approach the problem posed for
an officer or a civil servant by an order to draft one of the Nazi extermination
decrees or to organize a transport of Jews to an extermination camp?
How would they react to the problem of the “informer wife” who, utilizing
a wartime decree authorizing—or perhaps commanding—the denunciation
of family members for utterances hostile to the regime, volunteered information
to a special tribunal about anti-Hitler utterances made by her husband
within the four walls of the home, refusing the even then existing
privileges of the wife not to testify against her husband because she welcomed
this opportunity to dispose, under the cover of legal authority, of
her husband and to carry on her own love affairs?76
The first possible approach is that of transcendental or “supernatural”77
ethics, which corresponds to the orthodox natural law approach in legal
philosophy. This approach would regard the type of decrees that led to
Auschwitz and Belsen as contrary to a natural law of respect for human
dignity, as an emanation either of the law of God or of universal reason. It
would conclude that a law clearly offending against these elementary principles
was void and therefore not binding. From this premise flows the right
to punish those who offended the higher law by obeying the positive law.
In technical terms, this means that a subsequent legal order such as that
expressed by the Nuremberg Charter or by postwar German legislation is
made applicable retroactively.
A second approach would be that of the intuitionist ethics. The rightness
or wrongness of a conduct would be determined by an objectively but intuitively
known feeling of right or wrong, a Rechtsgefühl or a sentimento
giuridico. The difficulty with this approach is that an intuitive evaluation
can lead the individual concerned to very different decisions. He may intuitively
feel the wrongness of an extermination decree and derive from this
his duty to disobey it, or he may on the contrary accept the injunction of
the Nazi law of 1935 which empowered judges to inflict punishment “in
accordance with the sound instincts of the people,” interpreting such sound
instincts as dictating the persecution and even extermination of Jews, Slavs,
and other “inferior” races. Or he may be inspired by the feeling: “right or
wrong, my country.” Intuition may help to inspire marginal decisions in the
sense indicated by Geny, but if asked to guide in the basic choice of values
it yields nothing.
Third, there are the various relativistic approaches. One of these, that
of Dewey, would be based on a pragmatic “logic of enquiry, directed to the
exploration of a given value.” Such an approach would tentatively appraise
the Nazi laws that “legalized” racial oppression, degradation of the human
personality, and mass murder as evil. It would, however, study the question
of subsequent punishment of those who obeyed the Nazi laws in the light
of feasibility. Such a study might show that a complete implementation of
the goal of punishing everybody who participated in the making and
execution of such laws was simply not feasible.78 The result of such a pragmatic
enquiry might be that a more modest goal, i.e., the selection for
punishment or other sanctions of those prominently associated with the
Nazi regime through their high position or known deeds, would implement
more adequately the objective of disapproval of the Nazi values and of
treating equals equally.
While pragmatic ethics are compatible with a relativistic approach, the
basic attitude of relativistic ethics would be that whether to obey or disobey
the Nazi laws was essentially a question of choice between the religious,
humanistic, hedonistic, and other values relevant to the problem. One possible
value—which indeed was chosen by the great majority of Germans—
was that of obeying the positive authority of the state, at the expense of the
principles of human dignity, compassion, and charity. One of the possible
values would be that of racial differentiation and inequality, approving discriminatory
treatment against Jews, gypsies, Slavs, and other “inferior”
races. But the rationalistic ethics that is usually combined with the relativistic
approach would demand a careful study of the means by which the
different values would have to be implemented. It would show, for example,
that the necessary implication of legal discrimination between “Aryans”
and Jews would lead not only to the undermining of the family but also to a
profound modification of the principles of equality, in contract, in criminal
law, and in other fields. Such clarification of the goals might at least articulate
and underline the severity of the choice between values. A relativistic
approach could, however, hardly lead to subsequent punishment of those
who obeyed the law, as a matter of legal justice. It could hardly, consistent
with its own approach, declare Nazi legislation as having been nonexistent.
It could justify subsequent punishment only as a matter of considered retribution,
or a necessary stage in the rehabilitation of the German people, or
in the evolution of international law, or some other political objective.
“Noncognitivist” ethics would dismiss the entire problem as beyond the
reach of rational discussion. It would regard the punishment of Nazi
criminals, or their nonpunishment, as expressions of conflicting emotions,
be they the retribution imposed by an outraged humanity, a sophisticated
version of the traditional exercise of the rights of victors over vanquished, or
on the other hand a skeptical or even cynical acquiescence in the man’s
cowardice.
The solution [I have] suggested ... for these problems is predicated on the
belief that no legally compelling solution can be found for this type of problem.
Whatever the technical device, a subsequent and differing set of values
has to be substituted for the values governing the offensive action. Whether,
and to what extent, to punish the soldier, the civil servant, or the “grudge
informer” for the actions described earlier is a metalegal problem. The
ethics common to all the theories sketched earlier is that predicated on an
individual freedom of choice. The ethical question of punishment or other
sanctions must therefore be predicated on the degree of freedom of choice
between alternative courses of action. This indicates leniency for those who
acted under the compulsion of the duty of a soldier or of a civil servant and
sterner treatment for those who, like the informer wives, in fact voluntarily
chose one course of action over another. Perhaps this approach may be said
to be close to an existentialist position: “To accept an ethical judgment is
to commit oneself to a course of action or a way of life, and it involves an act
of choice in which one is at least partly on one’s own, anxious but free.”79
Perhaps the most important, though disillusioning, lesson to be drawn
from the above survey is that, ultimately, all ethical theories, however different
in methods and goals, point to the alternatives of, on the one hand,
individual decision based upon the free and responsible weighing of alternative
courses of action and, on the other hand, submission of individual ethics
to the dictates imposed by superior authority.

SOCIAL MORALITY AND THE LEGAL ORDER

Although, following Strawson, we have distinguished individual ethics
from social morality, it is obvious that there is no complete separation between
the two. The social morality of a community will be determined by
the balance of the thousands or millions of individual ethical “pictures of
life” within it. This will not, of course, be necessarily an arithmetical
median. The relative impact of the multitude of individual ethics upon social
morality—and in turn, the impact of social morality upon the legal order—
will greatly depend upon the character of the society. In the predemocratic
age, the ethical values of a greater or smaller group of leaders had infinitely
greater impact than that of the inarticulate masses. The evolution of many
societies, from a stage of kingly or aristocratic leadership to the rise of the
middle class, and from there to the participation of the “common man,”
clearly produces a progressive widening of the basis for the impact of individual
ethics upon social morality. But whatever the relative weight of
the different groups within a society may be, the social morality of a community
at any given time will be the composite of a multitude of ethical
values. The variety of the latter depends in turn upon the degree of moral
freedom. A liberal and pluralistic society will more easily reflect a variety
of ethical values than an authoritarian one. The same number of pacifists
may, in one society, produce a legal procedure for exemption of conscientious
objectors from military service, while in another they may have no
impact at all upon the social morality and the legal order. Ultimately, a completely
conditioned society may reduce or eliminate the sphere of individual
ethics. If, as is now no longer a fantasy, the increasing control over reproduction,
through the selected implantation or substitution of certain genes,
will be under the control of the masters of a society, individual values—as
forecast in Huxley’s Brave New World—will become an automatic and
standardized reflection of officially controlled genetics. The cultural counterpart
is the all-pervasive control by “Big Brother” over the individual movements
and actions of all individuals. All that we have said so far on the
impact of ethics, and the variety of ethical theories, is conditioned upon the
survival of social conditions in which individuals can still be produced, grow,
and develop with a degree of uniqueness.
In any society there is a close connection between social morality and
the legal order. There cannot be—and there never has been—a complete
separation of law and morality. Historical and ideological differences concern
the extent to which the norms of the social order are absorbed into the
legal order. And while, in the traditional, more or less custom-bound society,
the flow was essentially in one direction, the gradual transformation of social
behavior into legal custom and from custom into legislative prescription, in
the contemporary, highly articulate and organized society, the law becomes
in turn increasingly a major factor in the formation of social morality.
This interrelationship cannot be bypassed by any legal theory which
maintains that law is a self-contained order of enforceable prescriptions.
The difference between certain “positivist” theories, such as those of Austin
or Kelsen, and others which in one way or another incorporate ethical postulates
into the concept of law and the legal order, lies mainly in the question
of whether the metalegal foundations of a legal order should be sought inside
it or outside it. The “habitual obedience” which forms part of Austin’s definition
of law, or the “minimum of effectiveness” which is the condition of
the continuing validity of a legal system in Kelsen’s theory, but also “rules
which contain patterns of conduct for the exercise of force” (Olivecrona),
or the “peaceful co-existence of masses of individuals in social groups and
their cooperation for other ends than mere existence and propagation”
(Lundstedt)—all incorporate into the law a certain body of social norms,
whether the latter be stated as hypotheses or “facts” or parts of the legal
definition itself. In the words of one of the most strongly antiidealistic
Scandinavian realists, “The study of law must in the final analysis be a
study of social phenomena, the life of a human community; and jurisprudence
must have as its task the interpretation of the ‘validity’ of the law in
terms of social effectivity, that is, a certain correspondence between a
normative idea content and social phenomena.”80

LAW, MORALITY, AND SOCIAL CHANGE

Unless a minimum of conformity between legal order and social effectiveness
is maintained by the various processes of legal evolution, a revolution
will ultimately destroy the existing legal order and substitute a new one.
When the feudal order that tied peasant serfs to the land was no longer acceptable,
the peasants fled to the free cities and eventually the feudal order
collapsed. When a majority of Negroes no longer accepts legal, economic,
or social inferiority to a white minority within a legal order, and the change
of the legal system through legislative, administrative, and judicial reforms
fails to keep pace with the change of moral pressure, a revolution will ultimately
displace the former order. Sometimes the revolution will come from
outside, as in the destruction of the Nazi order by the majority of nations that
were willing to fight against it.
The normal process of interrelation between social morality and legal
order is one of evolution, i.e., the use of the instrumentalities of legal change
for the reduction of tension between the two types of normative order. The
intensity of this process of interaction is decisively determined by the degree
of organization of a society. Generally in primitive societies the reach of
authority, and therefore of law, is limited by physical conditions and social
tradition. Most of the social life moves beyond the law, which is concerned
with minimum order—defense, a rudimentary system of justice and police,
and a minimal revenue system sufficient to maintain government. It is only
against this background of undeveloped and slow-moving societies that the
theories of Savigny, Ehrlich, and other advocates of custom as against lawmaking
can be understood. In contemporary society, the reach of the law is
far greater, and there are correspondingly closer relations between the legal
order and social morality. The transition can, in our time, be closely observed
as the many new states of the postwar world seek to transform themselves
from traditional static and agricultural societies into societies that
aspire to economic development, diversification, and social change. The legal
machinery becomes the paramount instrument of social change. In the
process it often becomes necessary for the law to impose new patterns of
social behavior upon the society.
Thus it may become necessary for the state that seeks economic and
social development to destroy existing patterns of land ownership, especially
where they are linked with tribal custom and family tenure. In order to
become a modern society, India found it necessary to legislate the abolition
of the caste system and polygamous marriage. The fact that the legislation
has hitherto been far from effective, especially with regard to the abolition
of the caste system as a continuing pattern of social life, shows that the
power of the law to influence and change social morality is as yet far from
unlimited.
The majority of legal systems move between what Strawson has called
“maximum” and “minimum” morality, i.e., they vacillate between the incorporation
into law of those moral conditions which are crucial to the
survival of the legal structure and the transformation of all or most of the
social norms of the community into legal norms. The question will often
arise: What in fact are the minimum moral conditions essential for the
survival of society which therefore require their hardening into legal norms?

LAW AND THE ENFORCEMENT OF MORALS

This question has been the subject of sharp controversy in recent years,
against the background of two important aspects of the relation between
law and morality, one contained in a decision of the House of Lords,81 the
other in the report of the Wolfenden Committee published in 1957. In
Shaw’s case, the defendant had composed and procured the publication of
a magazine called The Ladies’ Directory which gave the names and addresses,
as well as nude photographs, of prostitutes, supplemented by a
coded indication of their sexual practices. Although Shaw was clearly guilty
of two statutory offenses, i.e., publishing an obscene libel and living on
the earnings of prostitutes, the House of Lords, with only one dissent, also
convicted him of a “conspiracy to corrupt public morals.” The House of
Lords here emphatically asserted “a residual power, where no statute has
yet intervened to supersede the common law, to superintend those offenses
which are prejudicial to the public welfare” (Lord Simonds). The subsequent
discussion, to which the most prominent contributions are Professor
Hart’s Law, Liberty and Morality (1963) and Lord Devlin’s The
Enforcement of Morals (1965), centered around the question of how far
the law should go in legislating on morality, beyond the elementary needs
of public order.82 The same problem was raised, in a socially more serious
context, by the report of the Wolfenden Committee which recommended,
by a majority of twelve to one, that homosexual behavior between consenting
adults in private should no longer be treated as a criminal offense.
The crucial issue of the relation between law and social morality is put in
the words of the report itself:

Unless a deliberate attempt is made by society acting through the agency
of the law to equate the sphere of crime with that of sin, there must remain a
realm of private morality and immorality which is, in brief and crude terms,
not the law’s business.

The question of what the proper sphere of law is, in relation to morality,
was the main subject of the debate between Professor Hart and Lord Devlin.
The former based himself essentially on John Stuart Mill’s essay On Liberty,
in which Mill said that “the only purpose for which power can rightfully be
exercised over any member of a civilized community against his will is to
prevent harm to others.” By contrast, Lord Devlin maintained that the state
may claim on two grounds to legislate on matters of morals. It could function
to promote virtue among its citizens—the Platonic ideal—and therefore
claim “the right and duty to declare what standards of morality are to
be observed as virtuous and must ascertain them as it thinks best.”83 This
conception of the state, which invests it with the power of determination
between good and evil, destroys freedom of conscience, and paves the road
to tyranny, is unacceptable to Anglo-American thought. Alternatively, “society
may legislate to preserve itself.” In Lord Devlin’s judgment, the House
of Lords in Shaw’s case had done just this when it sought to indict the defendant,
inter alia, for corruption of the moral welfare of the State. And it
was a jury of twelve reasonable men, representing the moral standards of
the society rather than the educated elite, that best represented the moral
standards of a society.84
While this debate is highly relevant to the question whether and to what
extent a law should, in contemporary British society, interfere with actions
that, however contrary to predominant sexual morality and practice, are
carried on in private and therefore do not directly affect the public, it does
not elucidate the theoretical question of the relation between law and social
morality. As we have seen, the dimensions of public order vary greatly from
one type of society to another, both historically and ideologically. In a
theocratic or totalitarian society, the regulation of sexual practices or of
freedom of discussion, even in private, may be eminently a matter of
“public order,” whereas in a liberal contemporary democracy, influenced by
modern psychological, criminological, and sociological studies, male homosexuality
carried on in private may be regarded as being of no concern to
public order.85 The Spartans approved of homosexuality because they believed
that it promoted courage in battle.86 A Spartan-type society might
well legislate for the promotion of homosexuality, private or public, as being
an important aspect of “public order.”
The essential theoretical lesson of the discussion that has arisen from
Shaw’s case and the Wolfenden Committee report is that modern, articulate,
and highly organized society, equipped with a multitude of media of communication
and information, has the means and the power to transform
preferred moral standards into law, but that the question of how much of
social morality should be regulated and promoted by law is one deeply dependent
upon differing social ideologies and ethical valuations.
The only general conclusion to be drawn is that, in any society that preserves
a modicum of individual responsibility, there is a tension between individual
ethics and social morality on the one part, and social morality and
the legal order on the other part. How much these three spheres of normative
order influence and modify each other is a question that cannot be
answered in absolute terms.

NOTES
1. This article appears as a chapter in Wolfgang Friedmann, Legal Theory, published
by Stevens & Son, 1968, and is reprinted by permission of Stevens & Son.
la. Fuller, The Morality of Law, p. 46.
2. Province of Jurisprudence Determined, Lecture 1, Library of Ideas Edition,
p. 19.
3. This description is fully applicable only to municipal systems, and not to contemporary
international law, which lacks both a clearly defined sovereign and
effective sanctions. Whether international law is recognized as law, “or merely as
positive morality” (Austin) depends on the relative importance one attaches to
customary observance of a rule or a system, as against the requisites of command
and enforceability.
4. Ross, On Law and Justice (1958), pp. 32, 34.
5. Modern anthropologists have, however, corrected the impression that primitive
societies are generally devoid of legal institutions. Quite elaborate adjudication
procedures exist in many primitive societies. See, e.g. Hoebel, The Law of Primitive
Man (1954), pp. 22 ff.
6. The Existence of a Legal System, 35 N.Y.U. L. Rev. (1960), at p. 1029.
7. Hart, The Concept of Law (1961), pp. 89 ff.
8. The distinction is, however, too categorically formulated, since the studies of
legal anthropologists, such as Hoebel, Gluckman, Llewellyn, Diamond, and others,
have shown that judicial authority—though not legislative institutions or rules of
change—is often very strongly developed in primitive societies.
9. See, in this sense, Ross, op. cit. at p. 60; Hughes, op. cit. at p. 1029.
10. Legal System and Lawyer’s Reasonings (1964), at pp. 178 ff.
11. For an interesting presentation of various structural models of a legal system
in a modern state, as applied respectively to developed, less developed, federal, and
unitary states, see Akzin, “State and Law Structure,” in Law, State and International
Legal Order, Essays in Honor of Hans Kelsen (1964), at pp. 1, 8 and following.
12. Fuller, The Morality of Law (1964), Chap. II, pp. 6 ff.
13. This is illustrated by the problem of strict liability in tort and crime, although
it does not become very clear what the learned author regards as “impossible,” since
strict liability is clearly, in the author’s admission, a frequent and justifiable phenomenon
of modern law. The internal morality of law demands only “that it define as
clearly as possible the kind of activity that carries a special surcharge of legal responsibility.”
(Op. cit., p. 75.)
14. Op. cit. at pp. 59 ff. By a somewhat strained reasoning, Fuller seeks to show
that the latter are not really retroactive, because “men are in effect penalized for
what the law originally induced them to do.” For the contrary and more persuasive
interpretation of such tax laws as retroactive, see Hart, Book Review, 78 Harv. L. Rev.
(1965) at pp. 1284 ff.
15. “Positivism and Fidelity to Law,” a reply to Professor Hart, 71 Harv. L. Rev.
(1958) at p. 660.
16. Hart, 78 Harv. L. Rev. (1965) at p. 1284.
17. Fuller, The Morality of Law at pp. 145-151. The conviction that purpose
in law imparts to it certain moral characteristics also underlies Fuller’s earlier
discussion with Ernest Nagel: Human Purpose and Natural Law, 3 Natural Law Forum
(1958), 68 ff.
18. Fuller, Irrigation and Tyranny, 17 Stan. L. Rev. (1965) 1021, at p. 1033.
19. The Morality of Law, at p. 145.
20. See Hannah Arendt, The Eichmann Trial.
21. Nicomachean Ethics.
22. “Due process” must, of course, be understood here in a purely procedural
sense and not in the substantive sense given to it by the U.S. Supreme Court in the
interpretation of the Fifth and Fourteenth Amendments.
23. Ginsberg, On Justice in Society (1965) at pp. 62 ff.
24. Op. cit. at p. 71.
25. The difficulty of finding agreement on “relevancy” or “arbitrariness” is
illustrated by the following statements reported by Calvin Trillin in the New Yorker,
December 4, 1965, at p. 144—in a report on racial discrimination in Britain:
“Some labor-exchange managers . . . insist that a company’s refusal to hire colored
people represents neither personal prejudice nor racial discrimination:
“ ‘It is true that there are some firms that do not hire over a certain percentage
of colored workers,’ says a Youth Employment Officer.
“ ‘But isn’t that color discrimination?’
“ ‘No. The employer may know that if the balance is tipped his white workers
will leave.’
“ ‘Color discrimination.’
“‘No. It’s not that they’re white workers. It’s that they’re the experienced workers,
the ones who have been there long enough to know the business.’
“ ‘But the reason they leave—isn’t that color discrimination?’
“ ‘Oh, no. It might be any number of things. The white workers might not want
to work next to someone who smells of garlic. It’s not a matter of color.’
“ ‘Wouldn’t assuming that a man smells of garlic because he’s colored be color
discrimination?’
“ ‘It’s not the employer’s color discrimination.’ ”

“An honest and progressive English journalist who believes that people must be
taught to regard immigrants as individuals also believes that an automobile-insurance
company’s charging higher premiums for all colored people is no different from its
charging higher premiums for all journalists.”
26. L’Idée de Justice (1949) English Edition, The Idea of Justice and the Problem
of Argument (1963).
27. Op. cit. at p. 60.
28. With Olbrechts-Tyteca, Traité de l’Argumentation (1958).
29. Perelman, Justice and Justification, 10 Natural Law Forum (1965), pp. 1 ff.,
at p. 6.
30. Op. cit. at p. 16.
31. Op. cit. at p. 20.
32. For an excellent critique of Perelman, see Stone, Human Law and Human
Justice, 1965, pp. 346 ff.
33. Ehrenzweig, Psychoanalytical Jurisprudence: A Common Language for Babylon,
65 Col. L.R. (1965) 1331.
34. The Rediscovery of Justice (1946); Prolegomena.
35. Bienenfeld, Prolegomena 22 ff.
36. See, among many others, Broad, Five Types of Ethical Theory (1930), pp.
276 ff.; Frankena, Ethical Theory, in Philosophy (1964), at p. 347: “The main
concern of ethical theory or moral philosophy in our period has been a so-called
meta-ethical question rather than normative or practical ones.”
37. Philosophy, Vol. 37 (1961), p. 1.
38. Op. cit. at p. 4.
39. See, for example, among recent writers, Kantorowicz, Definition of Law
(1958), at p. 43.
40. See, among many other surveys, Frankena, Ethical Theory, in Philosophy
(1964), at pp. 347 ff.; Frankena, Ethics, in Foundations of Philosophy Series (1963);
Broad, Five Types of Ethical Theory (1930); Nakhnikian, Contemporary Ethical
Theories and Jurisprudence, Vol. II, Natural Law Forum, pp. 4 ff.; Ginsberg, On
Justice in Society (1965), Chap. I.
41. See Frankena, op. cit. at p. 356.
42. Frankena, op. cit. at p. 349.
43. Another distinction—in terms of end values—is that between teleological and
deontic ethical theories. In the former, the “good” is the end value, while duties and
rights are derivative (Aristotle). For the latter, rights and duties are primary, and
“good” is derivative (Kant).
44. Reine Rechtslehre, 2nd ed. 1960, at p. 200.
45. Ross, On Law and Justice (1958), at p. 34. For other discussions of validity,
see Stone, Legal System and Lawyer’s Reasonings, pp. 202-205; Hart, The Concept
of Law, Chap. VI; Christie, The Notion of Validity in Modern Jurisprudence, 48
Minn. L.R. (1964) 1049.
46. Chisholm, in Philosophy (Princeton Studies, 1964), at p. 244. Empiricism
is contrasted with “commonsensism” and “skepticism.”
47. Nakhnikian, op. cit. at p. 7.
48. H. L. A. Hart, in The Concept of Law (1961), Chapter IX, pp. 189 ff.
49. Op. cit. at p. 189.
50. This principle could easily be contained in the first.
51. There have been many definitions of the concept of “value” which, as observed
by a contemporary philosopher of ethics (Frankena, Ethical Naturalism, in
Philosophy, at p. 360), has become central in present-day thought. Generally acceptable
definitions would be those of Everett (Moral Values, 1918, at pp. 6 ff.:
“The principle which determines the subordination of one end to another”) or of
Perry (Realms of Value, 1954, at p. 3: “A thing—anything—has value, or is valuable,
in the original and generic sense when it is the object of an interest—any
interest.”).
52. The present writer’s illustration, not Dewey’s.
53. E.g., Theory of Moral Life, or in particular one of his last papers, “Ethical
Subject Matter and Language,” in Journal of Philosophy, Vol. XLII (1945).
54. “Ethical Subject Matter and Language,” op. cit., at p. 711.
55. 1935, at p. 238.
56. Human Society in Ethics and Politics, Preface, pp. vi-vii.
57. See particularly op. cit., Chapter IX, “Is There Ethical Knowledge?”
58. The Place of Reason in Ethics (1950).
59. Op. cit. at p. 71.
60. Op. cit. at p. 69.
61. Op. cit. at p. 160.
62. On Justice in Society (1965).
63. Op. cit. at p. 40.
64. Op. cit. at p. 46.
65. For a more detailed analysis of the evolution of the law of contract see
Friedmann, Law in a Changing Society, 1959, Chap. 4.
66. In the matter of contracts, by wage statistics, collective agreements, and other
relevant aspects of the terms of employment.
67. Appendix 1, “Concerning Moral Sentiment.”
68. Language, Truth and Logic (2nd ed., 1946).
69. Op. cit. at pp. 111-112.
70. Op. cit. at p. 14.
71. On Law and Justice (1958), at p. 274.
72. Reine Rechtslehre (2d ed. 1960), pp. 89 ff.
73. E.g., by Nakhnikian, op. cit. at pp. 23 ff.
74. In this respect, the “language philosophers’ ” approach may be compared to
Dewey’s “logic of enquiry.” Dewey’s work and American pragmatism have, however,
been all but completely ignored by English philosophers. See the recent admission
by A. J. Ayer (The Listener, Nov. 4, 1965): “My conception of philosophy as
an activity of analysis owed a great deal to Moore, as well as to Wittgenstein. Only
the emotive theory of ethics, the idea that moral judgments were expressions of
feeling and so neither true nor false, was relatively new. I recently discovered that
my synthesis of these ideas was similar to the position taken by William James,
under the heading of pragmatism, or radical empiricism, thirty years before me.”
75. 70 L.Q.R. 37 (1954).
76. For a brilliant dialectical presentation of five alternative approaches, see
Fuller, “The Problem of the Grudge Informer,” in The Morality of Law, pp. 187 ff.
77. In the terminology of Frankena in Philosophy, at p. 446.
78. Something approaching 100 percent execution was attempted in the immediate
postwar Allied legislation that proposed a great variety of sanctions, for virtually
the entire German nation, in the light of a questionnaire comprising 132 questions
designed to show the degree of participation in the Nazi regime. As was to be foreseen,
this system broke down under its own weight, in view of numerous exceptions
made for those who were important enough to be necessary in the postwar administration,
the cooling off of revenge sentiment which set in after the first year or so
had elapsed, and the sheer impossibilities of fair administration. See on the entire
problem Friedmann, The Allied Military Government of Germany (1947), especially
Chapter 7.
79. Frankena, op. cit., at p. 435.
80. Ross, On Law and Justice (1958), at p. 68.
81. Shaw v. Director of Public Prosecutions [1962], A.C. 220.
82. The further question, to what extent the law courts, as distinct from the
parliamentary legislator, should do what the House of Lords did in Shaw’s case,
need not be discussed in the present context.
83. Op. cit. at p. 89.
84. Op. cit., pp. 89 ff.
85. For an excellent discussion of these questions, see Ginsberg, op. cit., pp.
230 ff.
86. Russell, Human Society in Ethics and Politics, at p. 99.


3. The Construction
of the Good
METAPHORS, ANALOGIES, MODELS, AND
ALL THAT, IN ETHICAL THEORY
Abraham Edel

It is nothing new that a successful theory in one field finds its uses, or
echoes, in another, as the seventeenth century used Euclidean models in
ethics, and the eighteenth used Newtonian models in economics and ethics,
and the nineteenth came to use evolutionary models. Once attention
is directed on this problem, many other models come readily to mind.
Think of the part that has been played in ethics by transfer to it of a legal
notion of contract, a medical conception of health, a psychological conception
of unavoidable pressures demanding outlet, a biological conception
of an organism whose members have diverse functions to perform in the
maintenance of the whole. Such models are not, however, always applied
at full strength. The application of a Euclidean or a Newtonian model is
the extreme case. Others, less systematically worked out, may operate only
in bits, often no more than by suggestion. Even when a theory in one field
colors the whole intellectual climate, its effect in other fields may be only
through analogy or metaphor. When the mind geometrizes, many things
everywhere are felt to “follow logically” no matter how large the gaps in
strict connection. When Newton reigns, even feelings begin to “gravitate
together,” and when Darwin rules, every field is busily finding it has an
“evolution.” That is why no sharp division can be drawn between the
study of full-fledged models and that of metaphors, analogies, and suggestive
comparisons generally. The common problem is the influence of
one field in shaping or affecting another.
The present inquiry aims to raise in some systematic fashion this whole
question of models in ethical theory by asking in what way and how far
different models have helped shape conceptions of the nature and tasks of
ethics, influenced the detail of an ethical theory, and even furnished integrative
concepts for its organizing structure. Section I considers the role of
models in the prior commitments of an ethical theory—its assumptions
about the world and man, about human nature and the human predicament.
Sections II to IV consider in increasing strength the role that models,
metaphors and analogies play in ethical theories. Section II starts with
casual use. Section III goes on with heuristic uses, where the comparison
serves suggestive functions in developing the ethical theory. Section IV
deals with structural uses in which the model organizes the ethical theory
and furnishes, in effect, its framework. Section V consists of some concluding
reflections.

I

Every ethical theory has prior commitments in that it operates with some
assumptions about the kind of world men live in, what men are like, what
typical situations they find themselves in, what are unavoidable problems
or conflicts in the human situation, and so forth. Now these psychological
or social or historical or religious or metaphysical “foundations” of the
ethical theory often already contain embedded metaphors or models that
played a part in the development of the underlying field. The consequences
of this model now extend to the ethical theory.
Plato’s psychological commitments in his Republic constitute an excellent
example of this process. As is well known, Plato uses for the human makeup
a metaphor in which each of us is said to consist of three parts—a human
or rational part, a lion which represents the noble emotions of shame and
indignation as well as the ambitious desire for prestige, and a dragon which
represents the appetites. This dragon combines blind aggression, sexuality,
and acquisitiveness; it is incapable of controlling itself, it is arbitrary and
capricious, and it demands immediate expression. It is tempting to call
this the hydra model of appetite, after the creature that Hercules fought,
which grew two heads to replace every one that he cut off; for though
Plato does not mention this myth in that particular context, according to his
treatment appetite clearly grows stronger as it feeds or is permitted expression,
rather than being appeased. Now Plato’s model, built into his psychology,
gives a definite shape to his ethics precisely because he leans so heavily
on the psychological commitments. For he ties his account of human
virtues to his picture of the parts of human nature and even defines the
central virtue of justice as the state in which the man in us, supported by
the lion, keeps the dragon in his place. The model thus shows that the ethics
will be a repressive one, and (with further assumptions about distribution
of proportional strength of the factors in men) Plato concludes that the
mass of men have to be held perpetually in an authoritarian mold.
By contrast, the Freudian hydraulic metaphor, conceiving of the flow of
libidinal energies and the dangers of their being dammed (assuming the
incompressible character of fluids), concentrates on repression as the
trouble spot, so that the general aim is to find satisfactory outlets. And
this vital difference from the hydra metaphor remains even when the
concept of the id is developed (employing other metaphors) with properties
of irrationality and demandingness quite similar to Plato’s appetite. An
ethics employing the Freudian psychology thus can set itself a quite different
goal from the repressive structure of Plato—broadly speaking, a goal of
increasing rationality through insight into oneself.
We must, of course, take care not to attribute to the model greater
importance than it may in fact possess in the psychological theory itself. It is
possible that a metaphor such as Plato’s may come after the fact to adorn
the achievement. After all, he himself gives us his dragon metaphor late
in the Republic (in Book IX), not in Book IV where his theory of the parts
of the soul is advanced. It is not easy to decide how strongly it has functioned
all along. The analytic discovery of the model in the presuppositions of the
ethical theory or in the ethical theory itself cannot as such then settle the
essentially biographical or historical question of what came causally first
in Plato’s intellectual development.
What has been suggested about the influence on ethics of models in prior
psychological commitments may hold equally in religious or metaphysical
or biological or sociological commitments. In the western tradition God
is conceived of as a father. This familial model accordingly transfers the
properties of the head of the household to the deity. The consequent stress
on the virtues of filial piety and obedience in the religion is not surprising.
And as the ethics as a whole is organized in religious terms, the familial
relations are projected through the religion upon the whole of life, and
a mutual interaction ensues in all those phases. In a similar process, when
the world is conceived on an organic model, with individuals as “members”
analogous to bodily parts, to demand sacrifice of some part for the good of
the whole is taken to be “natural.” The association of an evolutionary model
that underscores the struggle for existence with ethical “egoism” and of
an evolutionary model that underscores mutual aid with ethical “altruism”
admits of comparable analysis. In all these cases, a model originating in
some particular phase of life or some particular process is brought into
a domain such as psychology or religion or biology which in turn plays a
background role in an ethical theory. This prior commitment of the
ethical theory serves to project the properties of the model into the moral
outlook on life as a whole.

II

In considering the role of models and metaphors within ethical theory
itself, casual use is to be distinguished from heuristic and from structural.
A use is casual when it occurs on isolated occasions and is not elaborated.
“All life is a battle,” used in a moral context, no doubt conveys the general
notion that the moral virtues are to be identified with the military virtues
of disciplined struggle. Yet even if the assertion is repeated and becomes
standardized, it may still be no more than casual if it is regarded as
self-explanatory and put to no analytic use. If, however, we find the ethical
theory attempting to pinpoint the type of discipline required in war, the
elements of persistence and obedience and resignation, and shaping a
morality along these lines, the metaphor has clearly become heuristic. (How
far it is pressed is another matter—a really persistent use of the military
metaphor might inquire into different types of war and show the variety
of basic virtues and forms of social organization relevant for them, and even
raise paradoxes about the abolition of war as a moral ideal!) Finally,
if the theory attempts a systematic picture of the battle of life, if it asks
who is the enemy and who gives the orders, we may end up with a structural
use of the metaphor. Stoic ethics seems to approximate this. Epictetus
tells us to live our lives as if we were soldiers in constant danger of being
ambushed, not by what happens to us but by our reactions to what happens,
so that the enemy is located within ourselves. In addition to this focus
on self-struggle and self-mastery, we are told to obey the divine or rational
nature operative in things and so to be resigned to the lot in which we find
ourselves. In such extended use, the battle metaphor serves both as a keynote
and as a structural framework for the theory.
Isolated casual comparisons are no doubt often purely literary expressions
enhancing the style, and so may serve little or no role in the ethical
theory. When Kant remarks, early in the Foundations of the Metaphysics of
Morals, that a good will, though wholly lacking in power, would “sparkle
like a jewel in its own right, as something that had its full worth in itself”1
and goes on to play with the analogy, comparing usefulness to the setting
of the jewel that helps in commerce or attracts the inexpert, are we to
look for any further significance in the comparison? Or when R. M. Hare,
arguing that the prescriptive use of ethical terms is primary and the descriptive
use secondary, says that bringing up children with the latter turns them
“into good intuitionists, able to cling to the rails, but bad at steering
around corners,”2 do we need to explore clinging and steering to bring
theoretical enlightenment in ethics?
Yet even such casual analogies, invoked and then forgotten, may in
their brief use do a serious job—or sometimes help avoid doing a serious
job! Kant’s comparison of a good will to a jewel shining in its own right
suggests that intrinsic value is obvious. It therefore saves him the trouble at
that point of analyzing the emotions of the beholder who sees an act of
good will. Kant is quite conscious of the problem he has postponed. Shortly
afterward he adds a footnote acknowledging that he may be thought to be taking
refuge in the obscure notion of “respect” for the moral law; eventually, in his
Critique of Practical Reason, he goes into an elaborate treatment of the
notion, and in fact it turns out to be central to his position to maintain
that it is not an ordinary psychological sentiment.
Hare’s use of analogy on this occasion is also not without theoretical
significance. It says, in effect, that life has frequent novel turns, that an
attentiveness to the need for making fresh choices is central to ethics, and
that a prescriptive use of ethical terms enables us to deal more readily
with such turning-points. This important recognition of a pragmatic element
in deciding how to use ethical terms is passed off in the casual analogy.
Hare is thus able to insist that he is merely analyzing linguistic usage.
Relatively casual models and metaphors may be invoked to convey
a basic philosophical outlook in ethics. For example, N. Hartmann, in his
Ethics, often slips into a kind of astronomical model: there is an objective
heaven of values to be explored by turning our telescopes on it, and all
sorts of values lie there unseen and undiscovered. This kind of model
carries serious theoretical consequences, for it assumes that one who does
not see a given value is either looking in the wrong direction (passing
by on the other side of the street, says Hartmann), or is not sensitive
enough or not developed enough, or in a hopeless case perhaps “morally
blind.” This last metaphor has often been invoked by moral philosophers
to provide the reassurance that in a case of ultimate disagreement about
moral values, one of the parties to the dispute is wrong, so that the “objectivity”
of morals need not be abandoned. Neither in the frequency of its
use nor in the importance of its theoretical consequences is this metaphor
casual. It is only so considered here because it is invoked usually as if
it were self-explanatory, so that it falls short of being heuristic. For all
it does is to suggest that there are, in some domains of the world, inabilities
to perceive what is there. Whether values constitute such a domain is not
settled by invoking the model. A fully heuristic treatment would have
to take the parallel more strictly and elaborate a theory of moral vision
and the conditions of its partial or complete absence which could be subject
to independent confirmation.
The reach of casual metaphor in ethical theory is much greater than
we are inclined to suspect. The reason is the generic one that so much of
ordinary language is shot through with metaphor that it would be surprising
if ethical terms were wholly free and pure.3 “Right” carries notions of
straight and upright, and “ought” of debt. We feel some kind of connection
between what is morally right and what is nonmorally correct, and again
we think of our obligations to others as what we owe them. But only a
sensitive historical-minded philosopher like Nietzsche seriously suggests
using the debt metaphor for heuristic purposes when he attempts to trace
the idea of obligation to the debtor-creditor relation (in his Genealogy of
Morals). And it takes a full anthropological perspective to see that the
comparison to debt may play a structural role in a whole ethics.4 Again,
talk of “moral law” carries so much of the sense of command that major
philosophical confusions have arisen from the coupling of universality and
imperativeness. Ideals of “harmony” often come straight from music,
“equality” is frankly arithmetic or geometric, and even apparently logical
concepts of “consistency,” when used in ethics, carry directly the sense
of being able to stand together.
To the evidence of language that moral and ethical thinking is thoroughly
imbued with metaphor and analogy must be added that of the history
of thought in general, as well as comparative anthropological survey of
cultural differences. There are obviously some models in the history of
western thought whose use has been more than casual—most notably
that of art and craft in teleological philosophies and the machine in the rise
of scientific philosophies. These have tended to obscure what is probably a
great variety less systematically analyzed. Reproduction furnished a model
for ancient Greek theogony and cosmology. There must have been considerably
more use of growth models than usually noted, in casting ideas of the
good; for surely a human history so bound up with the discovery and development
of agriculture could not have overlooked this in its patterns of thought.
That fire played some role can be seen, if nowhere else, in the Stoic notion
of the divine fire as present in every man—the metaphysical formulation of
its cosmopolitan outlook. The story of the sun and of light in the history
of western philosophy is yet fully to be told. Its role as the visible model
of the Good in Plato’s ethics, the relation of illumination to thought (the
parallelism of light and active reason) in Aristotle’s theory of knowledge,
the sun shining without apparent loss of power as a model of emanation
in Philo and Plotinus, the unification of power, light, creativity, and goodness
in the sun, and the parallel combination in divinity—all these are tantalizing
items that a fuller history of thought models has yet to make richly clear.
Health models—the view of morality as a kind of health—are already found
in some primitive societies. And if we add to the physical models and
models from various crafts and processes of work others that come from
human institutions, the variety seems almost indefinitely increased.
Now while many of these comparisons are casual in the senses indicated
above, some models have obviously been put to work in ethical theory
for analytical purposes and some for structural purposes. Illustrations of
these theoretically more serious uses will be taken in the next two parts
from the history of ethical theory.

III

In the actual processes of ethical theory—the analysis of ethical concepts,
forms of ethical expression, modes of organizing moral data, methods of
validation and justification in ethics, articulation of criteria and systematic
standards—we find frequent resort to models in a heuristic fashion. Some
properties of the model are selected and elaborated to suggest a similar
pattern in the ethical material. Sometimes the model is simply used, and
we cannot be sure how conscious its employment is; sometimes there is
a conscious attempt to work it out. Let us examine a few illustrations.
Every now and then we come upon a political-legalistic model hard
at work in Kant’s ethical writings. We see it in his account of conscience
as practically an internalized court of law in which a man is judged or
condemned within his breast for violation of the law. Kant projects this
picture into his religious philosophy: God is assigned the properties of
holy lawgiver, good governor, just judge, precisely the properties Kant
has decided that ethics needs. Some of Kant’s arguments are directly
transferred from the model: for example, no man can be a judge in his
own case, and therefore the judge in conscience points to an impartial outside
being. Kant’s use of the model either influences or reflects his basic
view of morality as willed law.
The conscious search for paradigms of ethical terms in some special
way of using language—so characteristic of contemporary British ordinary
language analysis—is an excellent example of the heuristic treatment.
Most of these attempts agree in construing ethical utterances as practical,
but differ in the particular linguistic activity to which comparison is made.
While one selects emotional expression through language, another shifts
to persuasion, a third to commanding, a fourth to commending, a fifth to
performing, a sixth to advising and choosing, and so on. Sometimes all
ethical utterances have been construed along the lines of one model. But
more recently there has been a growth of refinement: one starts with the
ethical expression and asks which model to invoke. For example, it is
asked whether “ought” in its second-person use (“You ought to do it”)
has an imperative character, or whether in its first-person use (“I ought to
do it”) it has a performatory-decisional character. This is equivalent to
asking whether the command model (as in the “Thou shalt” of the Decalogue)
and the so-be-it model of performatory utterance whose exploration
J. L. Austin initiated (the “I do” of the marriage ceremony or the “I accept”
of the moment of contract) are primary models for ethical theorizing. The
same kind of point may be made in a nonlinguistic way. Nietzsche, for
example, reads a command character into the very act of willing, so that
my willing anything at all is my commanding myself. In this way he is able
to see willing as a central form of power striving and thus to assimilate ethics
to a power domination-submission model.
An interesting illustration of an unusual comparison to serve a specific
function—in this case to work out dimensions for evaluation—is Parker’s
use of the musical model.5 He not merely employs a general concept of
orchestration of values, but selects features of sound such as volume,
timbre and pitch alongside of the more familiar ones (such as intensity
and duration) to be given an interpretation in value measurement—quality
being the value analogue of timbre and height of pitch. He argues (as J.
S. Mill had done for quality of pleasure) that these are not reducible to
quantitative compounds, nor is height (for example) merely rank based
on other measurements. No more is claimed for the analogies than utility
in discovery; the features discovered, he says, could have been found in
any value field.
The monetary model is a more familiar one in value measurement,
and has had a wide scope, most notably in Bentham’s felicific calculus.
The properties he assigns to pleasure and pain—the homogeneity, quantitative
additive character, possibility of trans-personal summation—are
obvious properties of money. When questions arise about difficulties in
applying the specific criteria of the felicific calculus, he sometimes explicitly
suggests a monetary solution. Thus he invokes such maxims—certainly
useful in domains of legislative reckoning—as that increase of happiness can
be assumed correlated with increase of wealth. Or else he turns pragmatic
and says that unless you use something like money as a standard for
pleasure you will not achieve an objective morality. The model by no
means exhausts the contribution of the felicific calculus, nor Bentham’s
contributions to the theory of ethical measurement, but it does operate as
a support.
In problems of validation and justification in ethics, the most powerful
model has been the Euclidean. An ethics was to take the form of a deductive
system, like Euclidean geometry, and validation of an ethical statement
would consist of proving it as a theorem within the system. This model
has been long under attack in ethics, in spite of its fascination, and the
very diversity of attacks shows how complex are the features of the model.
For the fact is that Euclidean geometry tied into a bundle at least five
features, which may be quite distinct: (i) the use of logical deduction,
(ii) certain types of first premises in the system—universal in form and
sometimes described as necessary, (iii) intuitive certainty or self-evidence
of the premises, (iv) no need for specific verification of the theorems once
they have been deduced in the system, (v) the unity of the system as a
whole in that it purports to cover its field by its relatively few initial
premises. Now in ethics, the use of logical deduction is attacked by voluntarists
and emotivists, among others; the insistence on universal premises
by those who take moral rules to be frequency statements summarizing
individual particular moral reactions; the intuitive certainty by empiricists;
the omission of specific verification by empiricists and inductivists and
even perhaps particularist intuitionists; the unity of the system even by
pluralists who accept all the rest but whose intuitionism stresses a list
of independent rules and problems with only contingent relations. Now
there are no doubt clusters and natural linkages among some of these
diverse criticisms, but the separability of the features is not affected. The
history of the Euclidean model thus shows that where it exercised a
powerful attraction it meant a major construction in ethical theory. In
this respect it might almost be regarded as a structural use of a model except
372 3. THE CONSTRUCTION OF THE GOOD

that it was so limited to the methodological question of form alone, and
seemed indifferent to the kind of content—whether it was a geometry
beginning with the nature of God or Substance, with premises about natural
law or about movement or about sentiments.
The lessons that emerge from these examples about the benefits and
dangers in the heuristic use of models may be deferred till we have dealt
with structural uses.

IV

Several of the models already mentioned or described above appear to
have a structural use. They provide central or integrating concepts and
the relations they elaborate remain as an organizing framework for the
ethical theory. The military model in Stoicism, the legalistic model,
especially with a divine lawgiver, the debt model briefly mentioned, have
either been of this sort or seem capable of such a development.
For a complex model which often takes us unawares in building assumptions
into ethical theory and eventually provides a fairly popular framework,
consider the medical model. In questions of the body it involves a well-functioning
organization with integrated aims of survival and continued functioning,
fairly clear indicators of the forces that upset equilibrium, self-regulating
processes to counter such intrusions, and a fairly well-articulated
ideal of health to govern reflection for practice. When the model is transferred
into ethics, a parallel structure of assumptions is elaborated. There
is a determinate set of strivings and needs whose satisfaction constitutes the
good; according to philosophical predilection it takes the form of either
an outright teleology or a “teleological mechanism.” It interprets the
doing of evil as either ignorance or the distorted pursuit of the good; it
rejects the demonic as abnormal, inclines to ideals of harmonious expression,
and often takes an optimistic view of man’s nature. Many of these tendencies
are also found in ethical theories that do not use a complete medical model,
but begin with a framework of drives or needs or impulses or desires as the
matrix of ethical inquiry.
In order to present the role of structural models in greater detail I shall
deal more fully with a comparison of two theories—Aristotle’s use of the
model of craftsmanship in ancient ethics, and Ralph Barton Perry’s use
of a model of economic enterprise in the early twentieth century. Aristotle
frequently tells us that nature works like the artist or craftsman. Perry’s
theory has not, of course, the stature or influence of Aristotle’s, although
it did play a large part in the development of general value theory in
twentieth century American philosophy. But it has a transparency which
makes it an apt illustration. In fact he calls his first book on ethics The
Moral Economy.6
ABRAHAM EDEL 373

Aristotle’s craftsmanship model is that of individual production of a
limited end in a particular case. The builder constructing a house or the
doctor restoring health by a specific treatment, as well as the sculptor
making a statue out of stone or bronze, are his favorite examples. The
process is guided by an inherent plan of the form to be achieved. For
Aristotle, nature works that way too, though unconsciously. (This is his
familiar concept of final causation, applied in biology and physics as well
as in human affairs.) The model refers thus to the small-scale craftsmanship
of his day, not to mass machine production of an endless stream of commodities.
Perry’s economy model too must be considered in the light
of the economy of the time at which he was writing—the first decade of the
twentieth century. We expect therefore to find the features of a free enterprise
business system with each enterprise pursuing its own affairs for its own gain,
cooperating with others for mutual profit and with an ideology of public
service as the outcome; and the whole accompanied by a sense of endless
possible progress.7 Given this brief sketch of the models, it remains to
see how permeating they are in the two theories, and what different
structures are the outcome. Let us compare (i) their conceptions of the
good, (ii) their modes of comparing or measuring values, (iii) their accounts
of what virtue consists in and what are the central virtues, (iv) their
general spirit and view of the nature and tasks of ethics.
(i) Corresponding to the position of end in the craftsman’s work, the
good is taken by Aristotle to be the central ethical concept. He employs
a means-end framework, hierarchically organized with means, sub-ends
and final end. The final fixed end for man is happiness, involving an
expression of the ordered capacities of his nature, culminating in his rationality.
Aristotle’s good as natural end is quite different from the types of
organization that other models in the history of ethics have encouraged—for
example, overarching endlessly approachable but unattainable ideals, or a
particular quality such as pleasure to be extracted from experience in
endlessly increased amount. It is the fixed goal of a craftsman who knows
what he wants and is working to it in a limited number of steps; Aristotle
dismissed aimlessness (having no goal) as folly, and endless pursuit of
the instrumental as vain. On the other hand, Perry starts with individual
interests as the units of life, each partial to itself and defining its good for
itself, but uniting in some form with others for greater scope and power,
and so by organization becoming an economy or community of interests.
There seems to be no predetermined control of the rise of interests, for
the fulfilment of any interest is taken to be good; the moral good, however,
is identified with fulfilment of an organization of interests. Instead of the
strict means-end hierarchy of the craftsmanship model, we have the rising
scale of cooperative organization from isolated interests, through reciprocity
of interests, incorporation of interests through a purpose, fraternity of
interests, to a universal system of interests (pp. 78-79).
(ii) In their modes of comparing or measuring value, Aristotle’s has a
constructionist character while Perry’s stresses the expansion of enterprise.
For Aristotle, the better is the more complete; he thinks in terms of a
whole of parts which one has in mind as the good, and whatever embodies
more of the elements that go to make up the whole is better than what
embodies fewer. In the Nicomachean Ethics there is in fact a dearth of any
other mode of judging the better than tying degree of goodness to extent of
completeness.8 Certainly there is nothing like the quantitative scale of the
hedonist, such as the Benthamite monetary model referred to above. Perry
does assert as an obvious truth that more good is better than less good;
but his model is not the accountant’s zest for growing profit, so much as
enterprise growth that subtends public service. Hence selection and gradation
of interests is made on the basis of contribution to the collective body.
Prudence rises to moral purpose as there is extension and organization of
more inclusive interest, and egoism is stupid provinciality. “Morality is only
life where life is organized and confident, the struggle for mere existence
being replaced with the prospect of a progressive and limitless attainment.
The good is fulfilled desire; the moral good the fulfilment of a universal
economy, embracing all desires, actual and possible, and providing for them
as liberally as their mutual relations permit.” (p. 72) Whereas the ultimate
good for Aristotle culminates in the contemplation of the eternal, as an
isolated achieved end, Perry’s “progressive and limitless attainment” has
to be safeguarded against such isolated absorption. His evaluation of art is
significant in this respect. Since the aesthetic interest absorbs us directly and
is self-sufficient, “its continuous return of good being guaranteed, it is one
of the safest of investments” (p. 192). But its isolation is a danger, and it
may yield a narrow concentration, inimical to progress. The value of religion,
on the other hand, consists precisely in the enlargement of the circle
of life to the world in its full sweep (p. 253).
(iii) There is a clear contrast in their treatment of virtue. For Aristotle,
virtue is the end product of learning by practice, almost as an
apprentice under the eye of the master. While no doubt a familial model
is invoked at some points, much of his account of the man of practical
wisdom serving as the model is suggestive of master-apprentice relations.
The differentiating mark of virtue is action according to the mean. Many
themes do enter into explication of the idea of the mean, but the dominant
one is the idea of the just-right, an artistic or craftsmanlike fashioning of
the raw materials of the human makeup to yield habits of selection that
will avoid both excess and deficiency. And this is the mood that pervades
the long treatment of specific virtues; in each case we have a picture of
raw materials being worked up into a mean-organization. In Perry, on the
other hand, the virtues are derived from the scale of cooperative organization,
each economy having its characteristic principle or typical mode of
action—for example, intelligence for the isolated interest, prudence for
reciprocity of interest, and so on up to good-will for the universal system
of interests. And he charts typical forms of errors for each level as well
(p. 81).
(iv) Not only is their general spirit different, but their conception of the
nature and tasks of ethics shows the difference in their basic models.
The craftsman makes his product according to definite and well-established
procedures and transmits his craft to his apprentices. Aristotle’s ethics
conceives of a fixed pattern of life with fixed goals and definite virtues transmitted
from generation to generation. The task of ethics is to build these
virtues and to cultivate the knowledge which brings the pattern to full
consciousness in those capable of such rationality. The culmination of the
good is the contemplation exercised by the truly wise. For Perry, the nature
of ethics is simply to offer “the most competent advice as to how to proceed
with an enterprise whether large or small” (p. 2). Its tasks are to build
more and more harmonious enterprise and to ensure that progress is not
ended.
It is perhaps worth noting that both Aristotle and Perry even in
casual remarks seem to have their models constantly in mind. Aristotle
says that if the best cannot be achieved we aim at what is closest to it, just
as (he adds) a craftsman does what he can with the material at his disposal!
And Perry is constantly speaking of “safe investments” and “yielding a
steady return.”

The conclusions to be drawn from this inquiry and its illustrations go
along three lines. First, what kinds of effects have we found the use of
models and the like to have in ethical theorizing? Second, how far is their
role one that can or should be dispensed with, or have they some distinctive
value? Third, insofar as they are used, what criteria can be suggested
for evaluating them? Obviously such reflections deal with heuristic and
structural use, not with casual employment.
On the first question, there is an obvious sense in which models seem to
be doing nothing essentially different from what would have to be done
without them. An ethical theory with or without models makes factual
assumptions, has some value commitments, uses a particular linguistic
apparatus, and shapes itself along the lines of a given structure. In the
illustrations given above, such results are in part secured through the use of
models. The Platonic metaphor of the dragon carried the factual assumption
of the inability of the mass of men to control their appetites without
repression; the metaphor of moral blindness hypothesized that there is
a type of value-vision which may be blinded; the Aristotelian model assumed
that all men pursue one ultimate end. Again, on imposing specific
values in the process of shaping ethical theory, insofar as a model molds
an ethical theory it contributes to the orientation that morality will have
in action. A legalistic model turns almost to worship of universality, a
craftsmanship model condemns aimlessness, a business model makes us
uneasy about lingering in the present with intrinsic values, a geometric
model imparts a rational character to our methods. Models, too, help
fashion the language of ethics. The craftsmanship model leads to an
equation or near-synonymy of “good” and “ends” or “intrinsic ends,”
monetary models and business models promote “value,” and legalistic
models turn ethics to a concentration on “moral law” and “obligation.”
Finally, that models help furnish a structure for ethical theories has been
illustrated in detail.
If these jobs can be done with and without models, are models dispensable
in ethical theorizing? How far are they simply imaginative diversions,
how far are they replaceable as analytical tools, or do they furnish genuine
theoretical components in the resultant theory? Certainly they are imaginative.
To think of life as an art, or as perpetual warfare under an
outside commander, or as constant self-legislation for a community of
equally free spirits, or as a hopefully prospering business, or as anguished
steering in an uncharted sea without even a port to aim at, is an imaginative
feat. Great models in ethical theory are profound creations. Whether they
are diversions is a more difficult question. They certainly are used seriously,
they often express basic trends, institutions and problems of a given age.
From the point of view of intellectual history, models in ethics, like basic
models in science, prove very revealing. They are worth serious study by
the historian of philosophy.
The formulation that a model is an analytical tool is itself a metaphor.
This metaphor suggests that the tool is not part of the product, and even
that what was brought about in one way may be brought about in others.9
In all the contributions of models as analytical tools, and even in their larger-scale
structural use, we might expect theoretical progress to take the form
of stating explicitly factual assumptions imported, values accepted or
shared, linguistic notions elaborated, and structural framework proposed.
At that point past models could be dismissed with thanks, or retired on a
pension of humanistic gratitude! Future work could go on in transparent
scientific terms. This is the position that sees all forms of models as at most
psychological aids, ways in which a discovery happened to be made or a
result achieved rather than as part of the structure developed.10
Such a view looks at the work as finished, the results as extricated and
separably formulable. It says, as it were, that when all is over and done
models will not be needed. But when is all over and done? In practical
terms, certainly, the kinds of arts and crafts change, and the craftsmanship
model can acquire richer meaning by attention to diversities—not to speak
of machine production and its conceptual effects! Similarly, even during
Perry’s lifetime the forms of business enterprise changed; new types of
business association developed, corporations and cartels came to the fore.
Though Perry largely outgrew the model when he came to his General
Theory of Value, the proliferation of forms in business enterprise might
have suggested further theoretical problems. More importantly, from a
theoretical point of view, to displace the models completely requires an
adequate theory of the elements that are being supported by the model—
of ends in Aristotle, and of interests in Perry, and their relation to desire
or appetite, to reflection, to pleasure and pain, and so forth. Even Dewey,
who in twentieth century ethics tried most directly to grapple with the
psychology of ethics and to relate ethical concepts clearly to revised psychological
conceptions, constantly found it necessary to invoke models of
various sizes—from his early criticism of the reflex arc model to his significant
analysis of ends (in Human Nature and Conduct) as functioning like
targets set up to shoot at rather than goals of action; or again, his modelling
of ethical evaluation (in his Theory of Valuation) on all sorts of social
processes of grading and estimation.
What I am suggesting is that models can have a theoretical function of
serving as theory-surrogates—not in the sense of replacing a theory already
achieved, but of holding the place for a theory to come. We know, for
example, that a completed ethical theory requires some conception of the
self with core values embedded in it. Different models in ethical theory will
suggest that these will be transcendentally derived or biologically grounded
or culturally developed, that they will be fixed or that they will be changing.
The model indicates direction of inquiry and broad character—while it
waits for the theory, it furnishes a kind of theory-sketch. Or again, it
functions in a more general way by being so rich in suggestion that proposed
theories may go off in different directions from it, none of them as
yet adequate to take over alone. In any case, however, the effort to reduce
and eliminate a model is a necessary one; for it is equivalent to
progress in discerning the precise relations required for developing theory.
The question is whether that progress can in fact take place—not just be
programmatically insisted on.
As long as models are employed, there is need of criteria for evaluating
their effectiveness. Some of these are obvious—suggestive power, discernment
of new relations, fruitful refinement of concepts and distinctions, comprehensive
scope. These are the methodological values of any directed and
systematic inquiry. More specific criteria of effectiveness come from the
specific role the model is playing and the detailed consequences of its use.
If it imports factual assumptions, are they correct? If it brings value-attitudes,
what are their consequences? If it alters linguistic uses, what refinement
is secured? And so on. Perhaps the most important general requirement
one might suggest for a model in ethical theory is that it should not
engage in smuggling factual and value assumptions but should declare and
even flaunt them; and in its general impact, it should wake ethics from its
dogmatic slumbers!
The implications of such a requirement are perhaps more far-reaching
for the treatment of models than might at first seem to be the case. For the
prevalent tendency has been a kind of tyranny of the model. A model takes
over, extends itself and dominates theoretical construction. Even a limited
analytical model often does this. For example, C. L. Stevenson began in
his Ethics and Language11 with the phenomenon of interpersonal disagreement
and built a model for interpreting ethical terms out of this, analyzing
ethical language as persuasive. He did not add an alternative model in which
ethical terms would be interpreted, for example, through the situation in
which men sharing some basic agreements seek to widen the area of their
agreement (though he does examine patterns relevant to this as a subsidiary
matter in the relation of means and ends, in chapter VIII). Consequently,
the model worked out on disagreement as the basic phenomenon
found it hard going when it moved to the region of personal ethical decision.
Would it be possible to analyze a man’s internal deliberation as
trying to persuade himself? What Stevenson actually did was to supplement
“disagreement in attitude” with “uncertainty in attitude” and then
try to interpret such individual uncertainty in deciding as inner conflict in
which one part of the personality attempts to control another.12 While this
no doubt suits some types of value decision, it clearly imposes a single
character on what is probably a rich variety. To recognize the possibility
that a family of models may be required for ethical terms rather than a
single model would be an initial way of escaping possible tyranny.
If even a limited analytical model tends toward tyranny, how much
more tyrannical is the practice of structural models. Yet if the analysis of
the structural models as theory-surrogates is appropriate, the maintenance
of alternatives would seem as necessary as the consideration of alternative
theories. This is of central importance both in the study of past models and
in the construction of fresh ones. Let us look in conclusion at each of these
points.
The basic contribution of the historical study of models in ethical
theory is to the development of comparative ethics. It makes us conscious
of the alternative structures that have been found, and their types or
families, and so the different ways in which problems of ethical theory
have been formulated. Take, for example, the major contrast between the
juridical model and the inductive Newtonian model. The first is oriented
to seeking laws for decision with a view to regulating behavior. The second
tries to find out how people actually behave, whether in the movement of
their desires or the paths of their sentiments (Hobbes, Hume, Adam Smith
and Bentham alike find their place here), with a view to expediting their
behavior and minimizing entanglements. If we recognize that the different
types of models develop their own concepts in terms of their own demands,
we are not perturbed, for example, by finding that the “ought” in the
juridical sense is not formally deducible in the Newtonian model. Hume’s
famous passage on the problem of transition from “is” to “ought” indicates
no more than the disparity of the models. What we have to ask instead
is how regulative functions are formulated in the Humean model. Such
comparative study breaks through the tyranny of a single model.
Similarly, in constructing, the only feasible path—unless it prove possible
to avoid all models—is their multiplication, even beyond necessity.
Much might be gained by going beyond lining up familiar ones in contrast,
and carrying out speculative development of unexploited analogies, metaphors,
models, if only to free ourselves by their multiplicity from any hold
that unnoticed ones may have in our thinking. I am not suggesting off-the-cuff
invocation of neglected metaphors, as perhaps a jeweller might say
that his profession has been neglected compared to the textile field: life has
to be cut to size in such a way as to catch the light, only controlled skill can
produce the readiness for beauty which is caught in the glow of the gem,
that the rapt beholding is the prototype of the grasp of intrinsic value. I am
rather suggesting that the counterposing of models from many fields of
study of man can help in ethical theory by revealing multiple relations in
new lights.
Think, for example, of what might be done with an electrical model to
break open the question of the nature of “intrinsic value,” or what is often
called “an end in itself.” Suppose an electrical current running through a
filament in an ordinary bulb. At a certain point incandescence takes place.
The glow is a fresh quality arising under determinate conditions of material
and electrical properties. So too ordinary human processes “light up”
as intrinsic values in the interactions of personal and social life. The value
quality is a special glow, not some special or added constituent. Such a
model may help counterbalance other physical models that exhibit goal-seeking,
for example, as patterns of machine process. But even more, it
suggests by contrast that the notion of “an end in itself” already contains
a hidden model—the craftsmanship model. For unless we assumed that
every act had an end, why would we think of some activity that was not
being directed to something beyond it as being somehow directed to itself?
Why not rather that it is not directed to anything? It is almost as if in
the monetary model we assumed everything had to have a price to have a
value, and so what was not for sale must be priced infinitely high! The
notion of intrinsic good has much more packed away in it than we are
likely to suppose.13
In general, if we recognize that the world (whether in nature or in human
life) is always far richer than any single model, and also that there is constant
change, then attention is shifted from the conflict of models to the
correlation of the various phases that the different models have grasped.
Even in the case of conflicting models—in the normative disciplines at
least—there need not always be a demand for a general choice between
them. Where they represent directly opposing forms of organization—for
example, social models such as socialist and laissez-faire capitalist, or ethical
models such as egoist, cooperative, altruist, etc., or the strenuous goal-directed
life, the Epicurean immersion in the present, the Stoic inner-oriented
stern resignation—they may be looked upon as different possible
forms of organization. We can then ask under what conditions each in fact
applies, and under what conditions it is desirable that each should apply.
Attention thus shifts to the question of the criteria for a domain shifting
from the scope of one model to the scope of another. There are limits, of
course, to such procedures of dividing the question, or correlating alternatives.
But sharp choice after the exploration of such procedures and with
a full sense of alternatives differs immeasurably from the tyranny of one
model. It is the difference—even in philosophy—between dogmatism and
insight.

NOTES
1. Lewis White Beck translation, The Library of Liberal Arts, p. 10.
2. The Language of Morals (Oxford, Clarendon Press, 1952), p. 75.
3. For some consideration of the roots of moral terms, see May Edel and Abraham
Edel, Anthropology and Ethics, rev. ed. (Cleveland, The Press of Case Western
Reserve University, 1968), ch. 10.
4. See the account of Japanese ethics in Ruth Benedict, The Chrysanthemum
and the Sword (Boston, Houghton Mifflin, 1946).
5. DeWitt H. Parker, The Philosophy of Value (Ann Arbor, University of Michigan
Press, 1957). See especially pp. 103-05 and 189 ff.
6. New York, Charles Scribner’s Sons, 1909, 1937.
7. For a discussion of Perry’s theory in its historical relations, in comparison
with Dewey’s, and in contrast with later 20th century ethical theories, see Abraham
Edel, “Some Trends in American Naturalistic Ethics,” in Philosophy in France and
the United States, edited by Marvin Farber (Buffalo, University of Buffalo Press,
1950).
8. In the logical works (Topics III) and in the Rhetoric, there are more varied
indices, largely taken from ordinary usage in preferential selection.
9. Our argument here is itself a good illustration of how a model works. It sets
reflection going. Why cannot a tool be used to achieve results and then have its
destined place in the completed product? A rock could be used as a hammer in construction
and at the last moment be fitted in as a coping-stone! It is the very idea of
a tool that is here being transformed in such an objection. To call something a tool
probably means that it is being used to act only on other things. But we could revise
the term to refer to a mode of action instead of an object. There would then be no
tools but only tool-behavior of things. Compare the view that there are no slaves
but men held in servitude, with the Aristotelian conception of natural slavery.
10. It is worth noting as parallel to this controversy about the eliminability of
models in ethics, the general conflict on the question of models in science and philosophy
in recent years. Two opposing views have taken shape. One maintains that
some kind of philosophical orientation of outlook—in effect a metaphysical model—
is inevitable as an initial act in interpreting the world, so that one’s picture of the
world is in some fashion different from what it would have been by taking an alternative
stand. For example, Stephen Pepper’s World Hypotheses (Berkeley, University
of California Press, 1942) outlines four basic root-metaphors and shows how
each develops its own interpretation of what evidence itself consists in, so that there
is no outside way of adjudicating absolutely between them. Pepper proposes instead
to see the advantages of these different spectacles by applying each in turn to the
various fields of philosophical inquiry. Kindred tendencies are found in Whorf’s
linguistic hypothesis that different languages embody basically different categories
of thought, so that the range of translation of ideas is severely limited. (Language,
Thought, and Reality: Selected Writings of Benjamin Lee Whorf, ed., John B. Carroll,
New York, 1956.) For a critical treatment of Pepper’s thesis, with some comment
on Whorf’s relation to it, see Abraham Edel, “Interpretation and the Selection of
Categories” in Meaning and Interpretation, University of California Publications in
Philosophy, vol. xxv, 1950, esp. pp. 69-72.
The opposing view, familiar in the positivist tradition, holds that explanation by
the use of models in philosophy represents the prescientific trend of explaining by
assimilating the unfamiliar to the familiar, and therefore may limit the development
of fresh modes of thought. Ernst Topitsch, for example, continuing the positivist
tradition with historical sophistication, argues that the metaphysical views of
the prescientific period represent the use of biological, technomorphic and sociomorphic
thought patterns that developed in ancient philosophy, in part constituting
a refinement rather than a sharp break with mythological modes of thought. See
his “Society, Technology, and Philosophical Reasoning,” Philosophy of Science, vol.
XXI no. 4 (October 1954), pp. 275-96; also his Vom Ursprung und Ende der Metaphysik
(Vienna, Springer-Verlag, 1958).
Thomas S. Kuhn’s The Structure of Scientific Revolutions (International Encyclopedia
of Unified Science, vol. II, no. 2, University of Chicago Press 1962) may be
seen as a counter-claim to the positivist thesis for science itself, since he views revolutions
in science as primarily the replacement of one paradigm by another, where
a paradigm is simply some particular accepted scientific work in all its richness
which furnishes the modes of analysis and problems of work for the scientists of its
time.
11. New Haven: Yale University Press, 1944.
12. C. L. Stevenson, Facts and Values (New Haven and London: Yale University
Press, 1963), essays IV and XI, esp. pp. 191-203.
13. In Perry’s General Theory of Value, to find something intrinsically good
represents no more than the effort to maintain or perpetuate an object of interest.
Apparently the theory still wants to make sure that there is no real resting-place to
hinder progress.
ABSOLUTISM AND HUMAN RIGHTS
Sidney Hook

A fundamental question which arises when we consider the relation
between man and society is: How can we maximize human freedom and at
the same time preserve the peace and security on which ordered society
depends? Or to rephrase the question so that its political character becomes
obvious: How can we maximize human freedom—which as a first approxi¬
mation I define as the right to act without let or hindrance by others—and
at the same time avoid the twin evils of anarchy, on the one hand, and
tyranny, the usual reaction to prolonged anarchy, on the other?
This question is of enduring importance. It arises in varied form in
different contexts. Each generation faces it with respect to its own special
situation. Today in the United States the question is becoming acute as social
pressures, not always short of violence, push against laws and customs con¬
sidered unjust, evoking in turn counter pressures. Indeed, calls can be heard
today for direct action that shade into incitements to overt violence, some¬
times in the name of holy causes. In some quarters there is a growing
fearfulness and intolerance of dissent—the oxygen of a free society. In
other quarters it is believed that no limits can be placed on dissent if
it is sanctified by conscience—as if nothing can die from excess of oxy¬
gen. Hostile social tensions seem to mount with intellectual confusion. To
some extent they appear to feed on each other.
Unfortunately, the question of the relation between freedom and order
is often answered unreflectively, solely in the light of the consideration:
Whose ox is being gored? Whose interests are at stake? In the South, until
quite recently, agitation for trade unions and Negro civil rights was often
opposed on the ground that it produced unrest and disorder in the com¬
munity. Yet today segregationists are invoking the rights they previously
impugned to justify highly inflammatory speeches calling for resistance to
recent civil rights legislation. Others, who have ardently defended the rights
of those agitating for civil rights, seem prepared to approve the suppres¬
sion of those agitating against civil rights.
Can we formulate a principled position, in the light of the democratic
philosophy, which will aid us at least in the theoretical resolution of the
problem? Such a position must be consistent not in the sense that the con¬
clusions will always be the same but in that the same considerations and
reasons that serve as grounds for one position will also serve as grounds
for different decisions in different situations. Of course, such grounds will
never be sufficient for decisions in specific cases, since many factors enter
into the decisions, but they can help guide us in the quest for solutions.
As a foil for the development of my argument, I shall take as a point of
departure some cases of social conflict from the recent past.
A few summers ago Shakespeare’s Merchant of Venice was performed
on the mall of New York City’s Central Park. A proposal was made to
televise it so that millions could see and enjoy it—whereupon a furious
controversy broke out. The proposal was bitterly contested by many leading
citizens, not only by Jewish laymen and Rabbis but by others, some of
whom had been quite active in behalf of civil rights legislation. Their argu¬
ment was that the production unfairly stereotyped the Jew as a merciless
Shylock. To televise the play would outrage the sensibilities of a large
minority, and, by being brought into the very living rooms of the people
of New York, it would breed conflict between groups now living peacefully
side by side in the large polyglot city. Similar objections had once been
raised by leading Catholic laymen and priests to the showing of The Miracle,
and were to be repeated on the occasion of the performance of Hochhuth’s
The Deputy. Today leading Negroes, and some of their liberal white allies
in civil rights organizations, strenuously protest the revival and public
showing of Griffith’s The Birth of a Nation and the production of any film
in which members of their race are stigmatized.
The standard, and I believe justified, reply to these objections was that
the fears of evil consequences were groundless, that the dross in the reactions
of the audience would be purged in the fires of great art. For there is good
evidence that the exaltation of the spirit produced by great tragedy opens
the human heart to compassion and frees it from prejudice.
Supporters of the Shakespeare production also pointed out that a policy
that forbade the representation of anything that rubbed raw the sensibilities
of minority or majority groups in the community would prevent the circula¬
tion or dramatic rendition of many great works of literature and art. Some
of the writings of Mark Twain and Charles Dickens would have to be
withdrawn from public libraries. Even pictures of the crucifixion in mu¬
seums might come under the ban as prejudicial to a religious minority.
Such censorship is too great a price to pay to allay anticipatory fears,
real or fanciful, that portraying members of some groups in a degrading
light would generate ill-will. After all, somebody must play the role of a
villain! Even if some ill-will is generated among those who are either too
obtuse or too prejudiced to realize that somebody of a villain may be any¬
body, the compensating values of free, unhampered expression are of
overriding importance.
Consider now a situation in which the conditions of the problem are
changed. Suppose we are dealing not with controversial artistic representations
but with cases of unmistakable group libel or of incitement to violent
action for which there are no alleviating or mitigating benefits. A few years
ago John Caspar, a virulent racialist, made incendiary speeches in New York
City hoping to provoke race riots. Not long after, the strangely misnamed
George Lincoln Rockwell, titular head of the American Nazi Party, de¬
fended the wisdom of Hitler’s policy in incinerating six million Jews and
urged its adoption in the United States. In Harlem in 1965, a Negro parti¬
san of Mao Tse-tung urged armed attacks against the police deployed to
keep the peace. In recent riots advocates have urged and carried out
attacks against police and firemen.
The question now is: Have they and others like them a right to speak
their minds or publish independently of the consequences of their speech
to public order? Can we ever limit or abridge these rights of expression
—the familiar freedoms of speech, press and assembly—and still consist¬
ently retain our allegiance to an open and democratic society? Must we
in other words regard the freedoms of the First Amendment or any of our
other constitutional freedoms as absolute? If not, how shall we interpret
them?
I take for granted here the validity of the many arguments formulated in
the works of Milton, Locke, Mill, and Dewey in behalf of the widest
freedom of expression. Freedom of expression is essential for the acquisi¬
tion of new truths and the continued testing of inherited truth; for self-
realization or fulfillment; for peaceful settlement of grievances and orderly
social change; for the proper functioning of democratic society in which
freely given consent is central; and for many other reasons. Despite the
impressive weight of all these arguments, we must bear in mind that in the
history of civilization, even of the enlightened West, the right to freedom
of expression has been recognized and accepted only recently and in
limited areas.
The question is whether despite the cumulative weight of all these argu¬
ments, freedom of expression can be abridged. This is not the same ques¬
tion as whether freedom of expression should be extended to Nazis, Fascists,
Communists and others who publicly proclaim that they intend to use the
freedoms of a democratic society to attain power and then destroy all
freedom of expression for others. Even if one firmly believes that freedom
of expression should be enjoyed by those who want to destroy it, one may
still hold that under certain circumstances limits may legitimately be imposed
on certain specific expressions.
Whatever theory is advanced to set limits to freedom of speech must meet
the “absolutist” view in its two versions, one defended by Alexander
Meiklejohn and the other by Justice Black with an occasional assist from
Justice Douglas. Before discussing this view I wish to say something about
the distinction between legal and moral rights. The rights listed in the
First Amendment, indeed in all the amendments, are constitutional and
legal. They therefore at first blush have a more restricted generality than
human rights, which must be justified on ethical grounds. Every declaration
of human rights holds up a standard to which positive law is morally sub¬
ordinate. For example, the United Nations Universal Declaration of Human
Rights explicitly declares that it is promulgated “as a common standard
of achievement for all peoples and all nations,” and that the rights it
enumerates “should be protected by the rule of law.” Now both in the sec¬
ond paragraph of its Preamble and in Articles 18 and 19 of the text, the
rights to freedom of conscience and religion and freedom of opinion and
expression—which are identical with the rights of the First Amendment—
are explicitly mentioned. Therefore it is obvious that, however it may be
with some other procedural constitutional rights, the First Amendment
rights, with the exception of the proscription of any established religion,
may be considered as moral or human rights.
Let us therefore, in a preliminary way, ask whether from the stand¬
point of ethical theory or moral practice we can reasonably maintain that
any right is absolutely valid independent of the consequences of its exer¬
cise upon society.
What do we mean by human rights? There seems to be a far wider agree¬
ment on what human rights are than on their definition. But by and large,
those who use the phrase “human rights” mean that a justified claim can
be made to the exercise of powers or to the receipt of goods and services,
benefits and concerns, or to the enjoyment of certain freedom from inter¬
ference. “A justified claim” means that upon sustained reflection we con¬
clude that we have an obligation or duty to acknowledge, respect, and as
far as it is in our power, to help realize such claims, regardless of whether
they are enshrined as “rights” in a constitution. The grounds of our obliga¬
tion or duty vary even more widely than differences in the niceties of defi¬
nition. They are theological, metaphysical, intuitive, scientific, and prag¬
matic (this classification is neither exhaustive nor exclusive—and need
not be considered here).
Now the relevant thing about our moral experience is that the problems
of moral choice that confront us cannot be characterized as problems in
which there is merely a conflict between right and wrong, or good and
bad. In that kind of situation, the problem is not to decide what to do but
merely to summon the resolution to do it. For, assuming that we have
already envisaged the means and instrumentalities required in the relevant
alternatives open to us, the problem in principle is already solved. No,
when we find ourselves in a moral quandary tormented by the question:
“What shall I do?” the agony of choice results from the realization that
right conflicts with right, good with good, and sometimes the right with
the good. We want both security and adventure and can’t have both. We
want to be just but discover we cannot be just without being cruel. We want
to be loyal, but if we are, we can’t be truthful and vice versa. We want to
be free to live our life but find that we cannot do so except on the ruins
of another’s life. These are the typical moral dilemmas. To the extent that
we resolve moral conflicts, one right or good is sacrificed to another.
Further, in the course of our moral career, we discover that there is no
one specific right or good or value which is always preferred in all cir¬
cumstances of conflict, there is no one alleinseligmachende Wert—no
specific all-sanctifying value—that one upholds at all costs in all circum¬
stances.
There is no time or need to make an inventory of the goods and rights
in our moral economy; but whether we take as our supreme value knowl¬
edge or truth or beauty or love or friendship, there are some situations in
which their pursuit may have to be morally condemned. Knowledge for
its own sake certainly ranks high in the hierarchy of values of any liberal
mind, but a scientist who experimented on a human being in order to dis¬
cover how long he could endure certain types of torture would be ad¬
judged either criminal or insane. And even if he argued, as some Nazi
physicians did, that knowledge derived from immersing victims in icy water
could be useful in aiding survivors of shipwrecks and planes ditched at
sea, we would not regard his statements as mitigating. Similarly, a man
who tells the truth all the time, especially when he is not asked, is before
long avoided as a moral pestilence. The Marquis de Sade may have ex¬
tended the ranges of sexual sensibility, but he was a moral monster whose
words and actions may have inspired others to horrifying cruelties of per¬
version on children. Those who out of transcendent love or friendship have
betrayed their comrades and country fighting in a valid cause are not there¬
fore absolved of infamy.
Santayana once defined a fanatic as a person who, having forgotten his
goal, redoubles his effort. If this were so, a fanatic would be more stupid
than dangerous. A more adequate definition of a moral fanatic is a person
who never forgets his supreme goal and never permits others to forget it—
who believes that his goal justifies the use of any means to achieve it, and is
therefore blind to the other ends destroyed by the consequences of his
means. Because such a person is a logical lunatic, he becomes a moral luna¬
tic.
This analysis holds, it seems to me, even for second order rights that are
not completely formal. When we say that a man or citizen has an absolute
right to equal consideration by the law or by the state, or that it is absolutely
wrong to impose needless suffering or unnecessarily cruel and unusual
punishment for a crime, no specific action is indicated. Equal consideration
is compatible with any mode of identical treatment—“equal
treatment” and “equal mistreatment,” and what constitutes “needless” or
“unnecessary” suffering, admitting that pain is intrinsically evil, depends
upon the situation in which there are present other intrinsic goods and evils.
If what I have said is substantially true—and there are refinements to be
considered—is it at all plausible to hold that although ethically there can
be no absolute human rights, there can or should be absolute constitutional
rights, political or juridical freedoms, that may never be abridged in any
circumstances?
At least three important considerations about the concept of freedom
seem to me to be overlooked by those who regard themselves as absolutists.
First is the failure to see that the concept of freedom in law and/or ethics
is obviously a normative one, and therefore no definition of freedom as
merely the power to act or to effect one’s desires without let or hindrance
by others, can be adequate. For such a definition makes freedom the power
to do anything one pleases, and no reasonable person can ever approve
of such freedom unless he knows how it is to be exercised. Otherwise we
could not limit the freedom of fools, criminals, and madmen. Not all free¬
doms are desirable. Freedom in a political and moral context refers to
more than voluntary action. The cry for freedom is always a demand for
a specific freedom or set of freedoms, for some particular power or powers
we believe should be gratified, for something that can be defended as rea¬
sonable, for a justified claim. It is inescapably normative.
Second, the logical correlative of every desirable freedom, of every con¬
sidered or reasonable demand for freedom, entails a demand for the restric¬
tion of other people’s freedom in some relevant respect. My demand for
freedom of speech is also a demand that the freedoms of those who move
to prevent me from speaking be curbed. The union’s freedom to strike
means that the freedom of others to enjoin, punish, or fine those who strike
must be restricted. It means the prevention of forced labor; and such pre¬
vention is an interference with the freedom of others to coerce their work¬
ers. It is in this sense that Bentham’s dictum “Every law is contrary to
someone’s liberty” is to be understood. It is misunderstood by ritualistic
liberals who believe that if one is tolerant one cannot consistently be in¬
tolerant of the actively intolerant.
Third, the obvious political facts of life reveal that even the freedoms
of which we approve often conflict. We find ourselves committed to in¬
compatible freedoms. It is all very well to say that we have a right to life,
liberty, property, and the pursuit of happiness. But the achievement of one
may preclude the enjoyment of others. This is true not only for individual
experience but for the community values as well. The ends or interests or
values expressed in the Preamble to the Constitution—unity, justice, do¬
mestic tranquillity, the common defense, the general welfare, the blessings
of liberty—may not be compatible with each other at any given time. To
establish justice in the South may require the disruption of domestic tran¬
quillity. To insure the common defense may require conscription and the
curtailment of liberties for a certain time in certain areas. Even more to the
point, the very means of realizing these ends or interests—the rights enumerated
in the Bill of Rights and other articles of the Constitution—may
conflict. This is apparent when we examine some of the specific rights,
explicit and implicit, therein enumerated. Freedom of speech, press, and
assembly may adversely affect a man’s right to a fair trial. Freedom to
know may violate the right to privacy, the right to be left alone or in peace
with our grief, our shame, or our failure. It can even be argued that there
is a conflict of rights in the very First Amendment, not yet discovered by
the Supreme Court, between the freedom to exercise one’s religion and the
prohibition of an established religion, for the free exercise of some religions
may demand an establishment.
The inescapable conclusion of all this is that under these circumstances
we cannot have more than one absolute right. For if we have two or more
allegedly absolute rights, who can guarantee that they will not conflict? Who
can offer a consistency proof immune from the possibility of historical
counterinstances?
All this may seem truistic and possibly trite. But truisms, even taut¬
ologies, have point and importance when counterposed to absurdities. We
live in an era of doctrinaire intransigeance in which the very words “com¬
promise,” “balance,” and “moderation” have acquired the same invidious
connotations as the words “collaboration” and “opportunism.” The po¬
sition developed so far has been denied, and indeed by one of the most
distinguished jurists in the United States, Mr. Justice Black, as well as by
one of the best-known champions of civil rights, the late Dr. Alexander
Meiklejohn. Justice Black is regarded as the spokesman of a group that
often constitutes a majority on the bench even though its support of his
theoretical point of view is qualified. Both Justice Black and Dr. Meiklejohn
have been supported by some scholars in the field of law and political
philosophy, particularly in their critique of the doctrine that rights and goods
must be balanced against each other. If I concentrate on Justice Black and
Dr. Meiklejohn, it is because both have been cited as the spokesmen of the
“absolutist” position—and because they are the ablest and most influential
representatives of the absolutist view, although, as we shall see, their posi¬
tions are different except on the central point.

In a famous and oft-cited lecture, Justice Black declared: “It is my belief
that there are ‘absolutes’ in our Bill of Rights and they were put there on
purpose by men who knew what words meant and meant their prohibitions
to be absolute” (35 New York University Law Review, 865).¹
Justice Black has retreated not an inch from these words but has re¬
peated and defended them against criticism. Well, then, what about the
obvious conflict of rights to which I have referred? To his credit, Justice
Black grapples with the question and discusses some of the very conflicts I
have cited. He himself declares: “I want both fair trials and freedom of
the press.” And suppose they conflict? Justice Black answers in effect: It
is impossible for them to conflict to a point where a fair trial becomes impossible.
In his own words: “I do not think that anyone can say that there
can be enough publicity completely to destroy the idea of fairness in the
minds of people, including the judges” (37 New York University Law Re¬
view, 575, my italics).
On the face of it this seems an extraordinary judgment in view of the
number of cases in which claims have emphatically been made that what
Justice Black declares impossible has repeatedly occurred—cases in which
the Supreme Court itself has so held. More pertinent to our analysis is
Justice Black’s assumption that unless publicity completely destroys the
idea of fairness, there is no need to be alarmed that justice will not be done.
Apparently, according to Justice Black prejudice must be complete to be
objectionable. Suppose publicity prejudices a judge or a jury only “90
percent,” however that is to be determined. Would we reasonably consider
a trial fair in such circumstances? Suppose only one juror were prejudiced!
Would we not say that this is one too many for the conduct of a fair trial? Or
must we conclude that all jurors were completely prejudiced, and the
judge as well, before concluding that due process has been violated? Ac¬
tually, Justice Black’s position rests on an evasion. He assumes as a fact
that it never can be established that publicity completely destroys the pos¬
sibility of a fair trial, that Dean Griswold of the Harvard Law School and
other jurists are mistaken in their contention that “Lee Harvey Oswald
could never have received a fair trial anywhere in the United States” be¬
cause of the television publicity. Surely it is not logically absurd to imagine
a situation in which publicity has this effect. Which right in such a situa¬
tion would Justice Black declare to be absolute—the right to freedom of
speech or the right to a fair trial, i.e., the right not to be deprived of life,
liberty, or property “without due process of law”?
There are some absolutists who, at this point, differentiate themselves
from Justice Black and make a distinction between political speech and
speech about other matters, and declare that only the first is absolute and
never to be abridged. This is the position of Dr. Alexander Meiklejohn. It
is also, with some inconsistency, the position of a harsh critic of my criticism
of Justice Black, who seeks to reformulate Black’s position in “a much
more defensible form.” I shall address myself to these points later on.
In justice, however, to Justice Black and my criticism of his thought, and
as evidence of the climate of legal thought in the citadel of ritualistic liberalism,
I must present some supporting details of my characterization of his
position. Justice Black refuses to accept the distinction offered by Meiklejohn
and, by implication, by those who would save him from himself by
putting wiser words in his mouth.
This is apparent in Justice Black’s discussion of what until now has been
the paradigm case of speech that is not privileged, that does not enjoy con¬
stitutional protection. It is the case, cited by Justice Holmes in Schenck v.
United States (249 U.S. 47), speaking for a unanimous court, of a man
who deliberately and falsely shouts “Fire!” in a crowded theater and thereby
precipitates a disastrous panic. Now Justice Black, like any sane man, be¬
lieves that such an action is wrong and should be punished. But his argu¬
ment is that the person should be punished not for his speech, not for what
he shouted, but merely because he shouted, because he created a dis¬
turbance and violated the right of private property.
Nobody has ever said that the First Amendment gives people a right to go
anywhere in the world they want to go or say anything in the world they want
to say. Buying the theatre tickets did not buy the opportunity to make a speech
there. We have a system of property in this country which is also protected by
the Constitution. . . .
That is a wonderful aphorism about shouting “fire” in a crowded theatre.
But you do not have to shout fire to get arrested. If a person creates a disorder
in a theatre, they would get him there not because of what he hollered but
because he hollered. They would get him not because of any views he had but
because they thought he did not have any views that they wanted to hear
there. That is the way I would answer: not because of what he shouted but
because he shouted [loc. cit., at 578-89].

Well, then, what if a person shouted fire not in a private theater but in a
public meeting place, in a church or school where there are no private
property rights to be considered? Would his words be privileged?
What if, in a theater, someone for the thrill or kicks of it falsely whisp¬
ered “Fire” or flashed the words on the screen: “Fire! Run for your life!”
What if there was no shouting at all?
Finally, what if a person shouted “Fire”—and it turned out to be true?
There was a fire! Would we then arrest him for shouting, for creating a dis¬
turbance even if it was a timely warning?
As if he were determined to wring the last measure of absurdity out of his
absolutist position, Justice Black goes on to assert that there should be no
libel or defamation laws in the United States—that indeed at the time the
First Amendment was drawn, the framers intended that there should be
no such laws, “just absolutely none so far as I am concerned” {loc. cit., at
577). The dogmatic certitude with which this discovery is proclaimed we
leave to the professional historian. More relevant to the understanding of
the contemporary absolutist position is Justice Black’s view of how the ex¬
pression “freedom of speech” should be construed with respect to its scope
and privilege.

I do not hesitate to say, so far as my own view is concerned, as to what
should be and what I hope will some day be the constitutional doctrine, that
just as it was not intended to authorize damage suits for mere words as dis¬
tinguished from conduct so far as the Federal government is concerned, the
same rule should apply to the states [loc. cit., at 578].

In other words, the laws against libel, slander, and defamation should
be abolished in every jurisdiction of the land. That these laws need reform
or reformulation to safeguard against abuse goes without saying. But to
wipe them out altogether is to disregard the obvious fact that on a man’s
reputation may depend not only his honor and that of his family but his
livelihood and the welfare of his dependents, sometimes his very life. The
irreparable damage to reputation resulting from calumny and malicious
slander has been one of the enduring themes in world literature. Even
criminals and lawbreakers are aware of the value of a good name. Many
a person has suffered more anguish from the loss of reputation, from public
shame, than from physical or monetary punishment. It is Iago, the arch
rogue fearing exposure, who utters the lines:

Good name, in man or woman, dear my Lord,
Is the immediate jewel of their souls.

Before considering the second and more restricted form of absolutism,
a word should be said about one attempt to meet these strictures by shift¬
ing the whole question to how we shall define the terms “speech,” “abridge,”
and “law.” The command that Congress shall make no law abridging free¬
dom of speech and press is absolute, to be sure, but not everything that
is said is “speech” and not everything that is published is “press.” “Speech”
and “press” are redefined in such a way that all expressions of speech and
press that are deemed constitutionally objectionable are ruled out as
“speech” and “press” and considered as actions that fall within the reach
of sanctions. This makes the position invulnerable but also comical. What
is worse, since there are no clear criteria for determining when “the freedom
of speech” which is not absolute becomes freedom of speech which is abso¬
lute, intellectual confusion results. Thus the utterance or publication of “fight¬
ing words,” “obscene words,” “defamatory words,” and “traitorous words,”
which the law has until now (despite Justice Black) held to be punishable,
is ruled out as not belonging to the realm of discourse at all. With a
straight face those who hold this view must say to a man who finds himself
in jail or slapped with a fine that he is not being punished for his speech
even though on all accounts he has done nothing else except speak. This
is hardly more bewildering than to tell him his rights are not being
“abridged” or that he is not in “prison.”
To be consistent, proponents of this view must hold that the meaning of
the other key provisions of the First Amendment must be redefined, too.
The right to the “free exercise of religion,” considered so important by the
framers that it is listed even before the right to freedom of expression, must
be deemed absolute, too. What shall we say then to those individuals who
have been convicted of sacred bigamy, of the use of rattlesnakes in their
rituals, of voodoo bloodletting, of refusing to permit the administration
of life-saving drugs or blood transfusions to their children? Obviously by
redefinition the proponents of absolutism must assure these individuals
that they are not exercising their “religion” or that their freedom to exer¬
cise religion is not really being “abridged.” In other words, we must add
insult to injury by implying that these people are not sincere martyrs to
their religion but simply criminals masking their actions falsely under the
labels of “religion.”
If these remarks are valid, they undercut the whole position of those like
T. I. Emerson, who contends in his Toward a General Theory of the First
Amendment (New York, 1966) (1) that the general theory of the First
Amendment is based on a “strict adherence to the distinction between
‘expression’ and ‘action’” (p. 90), (2) that “expression” should never be
punishable by any actions, civil or criminal, that only “actions” may be
legally regulated and punishable, and (3) that some expressions may be
legally punishable, e.g., certain threats, solicitations, libels, obscenities, and
so forth. This inconsistent triad of propositions (the truth of any two of
them entails the falsity of the third) seems to me an outrage both of com¬
mon usage and common sense. It is not mitigated by the practice of putting
quotation marks around the term “speech” when some particular expres¬
sion of speech is deemed legally objectionable (like incitement to violence),
and then classifying this “speech” as an action; nor by putting quotation
marks around the term “action” when certain actions like mass picket¬
ing or joining a terrorist organization are deemed not legally punishable
and then classifying these “actions” as expressions or forms of speech.
It seems to me simpler, less confusing, and intellectually more straight¬
forward to regard all expressions and acts of speech as forms of action or
human behavior, using various criteria to determine the degree of freedom
they should enjoy.
Dr. Alexander Meiklejohn has not resorted to such desperate stratagems.
Unlike Justice Black, he distinguishes between “private speech” and
“public speech.” The first is not privileged independently of its consequences.
Citizens can be punished for such speech or deprived of the
opportunity to speak by due process of law. Dr. Meiklejohn goes very
far in approving of restrictions of speech in this sector. He denies that any
publication, including radio and television, “engaged in making money,”
which uses speech to enslave “our minds and wills,” is entitled to the protection
of the First Amendment (Political Freedom, [New York, 1960], p.
87). On the other hand, public speech which addresses itself to public
affairs, public policy, and issues is absolute. There can be no restrictions
whatsoever of such speech since any government that rests on the consent of the
governed must permit all the winds of doctrine to blow in the marketplace;
such a government must allow all views and policies to be heard. No matter
what the clear and present danger of speech about any public policy may be,
in this view it is absolutely privileged. All other rights may be prima facie,
to use Ross’s distinction, but this one is not only presumptively but
unqualifiedly valid whatever the consequences. In fact, however, Meiklejohn
is confident that it is not speech in the public domain that can ever threaten
or undermine our free society but rather the doctrine of “clear and present
danger,” elaborated by Justices Holmes, Brandeis, and Frankfurter in
defense of that society. For this doctrine mistakenly sets limits on what in
a free society should never be limited.
This seems to be the view also of one of the defenders of Justice Black
against my animadversions (A. S. Kaufman, 52 Journal of Philosophy, pp.
241 ff.). Coolly disregarding the statements Black has repeatedly made
and also reaffirmed in reply to criticisms, Kaufman flagrantly misstates my
position and then restates Black’s position, in effect bringing it closer in
line with that of Meiklejohn. “Black’s central claim could be better expressed
in the following way: It is prima facie unconstitutional legally to condemn
and punish one who incites illegal action when the inciting circumstance
is talk about public affairs” (loc. cit., at 241).
Now it is quite true that in any free society a great deal of latitude must
be enjoyed in the advocacy of public policy and in all public discussion.
As I have argued elsewhere, the freedoms of expression and communication
are of strategic importance. Therefore the relevant rights of the First
Amendment and the other amendments are to be regarded as strategic rights
or freedoms for which we must be willing to pay a high price. This is the
rule. But here, too, the rule must be enforced intelligently. The intelligent
application of the rule need not countenance speech or any other mode of
expression that goes beyond advocacy to the incitement of violence and illegal
action. Such incitement may be legally proscribed and punished even
if a public issue is being debated because the presupposition of a free society,
whose general rule is freedom of speech, is that the consequences of such
speech will permit the settlement of issues by free and peaceful discussion,
not by force or violence. One may legally advocate repeal of the conscription
law or denounce the ban on racial segregation, but one may not urge
conscripts to refuse to serve or to desert, or incite a crowd to lynch or
riot. It is only the political innocence of the absolutists of all schools that
explains their failure to realize that the themes and issues about which
crowds are usually incited to violence fall within the sphere of public affairs.
Lynch law is preeminently a form of social action that unfortunately almost
invariably grows out of the words of agitators discussing a public issue.
That is why it is so bewildering to read the statement in one of Justice
Black’s opinions, Justice Douglas concurring: “I believe that the First
Amendment forbids Congress to punish people for talking about public
affairs whether or not such discussion incites to action, legal or illegal”
(Yates v. U.S., 354 U.S. 298, my italics).
“Talking about public affairs” is precisely what the white racialists like
Caspar and Rockwell and the black racialists like Williams of Radio Free
Dixie do. If talk cannot literally kill, it can trigger the action that does kill.
The presupposition of much of the discussion by absolutists is an extreme
kind of Cartesian dualism not only between ideas and action but between
words and action. Justice Holmes was wrong in asserting that
“every idea is an incitement” and it is doubtful that John Dewey was right
in holding all ideas to be plans of action. But it is not wrong to hold that
some ideas are plans of action, that some ideas expressed in words are
incitements to action, and that some locutions or speech acts are part of
an action or performance which if permitted to run its course would eventually
end in disaster. It is the objective situation, not the form of the sentence,
that determines whether words are primarily an expression of opinion, used to
induce a change of opinion, or an incitement to action.
An illustration as good as any is provided by the following words uttered at the time
of the Watts riots:
We are facing a future wherein the streets shall become like rivers of blood.
Let us be prepared to fight to the death, organize, arm, learn to shoot, and
handle explosives. When the impending showdown comes, use the match and
the torch unsparingly. The flame of retribution must not be limited to urban
buildings and centers, but the countryside must go up in smoke also. Remember
the forests, the fields, and the crops. Remember the pipelines and oil storage
tanks. Yes, let it be known to the world that we shall meet their sophisticated
weapons with the crude and simple flame of a match.
—R. F. Williams on Radio Free Dixie, August 21, 1965

Confronted by such talk about public affairs, it is hard to understand
what prevents common sense from breaking out except the fear that the
difficulty in drawing an exact line between protected and unprotected speech
will result in no line being drawn, so that any speech becomes problematic.
This is simply inviting intelligence to abdicate in favor of a few formulas
to be mechanically applied. Part of the failure flows from the description of
the conflict of freedoms as a conflict between the freedom of the individual
and the interest or security of the state or government. But this is a fiction.
The government or state has no political rights which are not reducible to the
political rights of its citizens as individuals or groups. Only human beings
have political rights, and the so-called conflict between the freedom of the
individual and the public interest, when it is genuine, is a conflict between
the freedom or right of some particular individual or group of individuals and
the freedoms and rights of the rest of us. Because those conflicts are inescapable,
we make certain rights and freedoms strategic in the political process in
order to maximize the desirable freedoms for all members of the community
and in order to insure the greatest degree of participation and freely given
consent in forging a public policy out of the clashes of public opinion. I
repeat that these rights are strategic or first in importance but not absolute.
They are first not because they are in the First Amendment, an historical
accident, or even because they are in the Constitution at all, but because of
our commitment to a free or democratic or self-governing society. However,
sometimes in the interest of preserving the entire structure of our desirable
freedoms, we may be compelled to abridge one or another of our strategic
freedoms for a limited time or in a limited place. When we balance one
freedom or right against others, we seek the action or policy that among other
things will best preserve the entire system of operating, desirable freedoms
which make possible the growth of new freedoms. And here there is no room
for the substantive absolutes of Black, Meiklejohn, or any of their
supporters.

To this the objection has been made that the entire notion of “balancing”
of rights against rights, of interests against interests (whether we speak of the
rights and interests of individuals or the government) is mistaken. In this
view, whatever balancing was needed, in consequence of the possible conflict
among the ends or values of the Preamble of the Constitution, was already
done by the provisions of the articles and amendments. Once the Constitution
has been completed, “such ‘balancing’ ceases to have any meaning whatever,
except as amendments to the structure are thought to be needed.”
At any given time, and no matter what the case at bar is, “there is no
power, nor is there any need to ‘balance’ the provisions of the Constitution
against one another, unless amendment of the Constitution itself is in
question”—a process beyond the jurisdiction of any court, congress, or
executive (Meiklejohn, 40 California Law Review, 9).
The question of “balancing” rights and interests is complicated by
antecedent commitments concerning what is entailed by the existence of a
written constitution. I want to make a few relevant observations about this,
reserving the detailed consideration of it for elaboration elsewhere.
(1) The notion that human rights and interests have once and for all
been balanced in any document is preposterous on its face. It would require
one to hold that those who drew up the Constitution could have anticipated
all the future situations of conflicting rights and interests that have arisen
in consequence of the industrial and technological revolutions and other
profound social changes. Or it requires the assumption that whenever a
conflict is disclosed (and almost every important legislative measure, e.g.,
the civil rights bills, and almost every case at bar involve such conflict),
it is necessary to go through the complicated process of amending the
Constitution, with the absurd result that no legislative action or judicial
decision could be taken without sanction from a Constitutional Convention
in continuous session, whose balancing would be subject in any case to
the interpretation of the Supreme Court.
(2) The theory that rights and interests have been sufficiently balanced
to permit the resolution of all conflicts by reference to the text presupposes
that, in Justice Black’s words, “the principles of the First Amendment
are stated in precise and mandatory terms and unless they are applied in
those terms, the freedoms of religion, speech, press, assembly, and petition
will have no effective protection” (Wilkinson v. U.S., 365 U.S. 422). But
it is simply false that these principles are stated in “precise” terms, as the
Supreme Court’s own shifting and inconsistent decisions on the meaning of
establishment and “free exercise of religion” amply indicate. And it is a
complete non sequitur to infer that unless these principles are stated in precise
terms, the freedoms in question will not be effectively protected. Not even
Justice Black could maintain that the principles expressed in the other
amendments, for example, the Fifth Amendment, have a precise meaning.
(How precise are “just compensation” and “self-incrimination”?) Yet, if
anything, there is widespread and reasonable belief that the rights and
interests covered by the amendment are so well protected as to appear
overprotected.
(3) Even if the principles of the First or any or all amendments were
stated in “precise” and “mandatory” terms, this would not exclude the
possibility that the principles were (a) inconsistent with each other and
(b) not sufficiently determinate or complete to settle possible conflicts. Not
only is it demonstrable that the principles of the amendments may conflict
with each other, even the allegedly precise terms of the First Amendment
can be so construed that its rights can conflict with each other.
(4) Those who deny that the balancing of rights and interests is legally
or constitutionally justifiable are (a) obviously and (b) radically
inconsistent.
(a) By “obviously inconsistent” I mean that there are cases in which the
absolutists have concurred in opinions in which the Court clearly balanced
conflicting rights and interests. For example, Justice Black concurred in
the mandatory Sunday Closing Law cases. In these cases the Court held
that despite the obvious religious origin of the statutes and the hardships
they imposed on the free exercise of the religion of certain Orthodox Jews,
Moslems, and Christian Sabbatarians, the interest of the community in
enjoying a common public day of rest must be given overriding consideration.
And in earlier cases (Schneider v. United States, 308 U.S. 147, and
Cantwell v. Connecticut, 310 U.S. 296), Justices Black and Douglas
concurred in decisions which explicitly used the logic and language of
“balancing” in weighing, on the one hand, the right of an individual to
communicate and, on the other, the right of his neighbors not to have their
peace and slumber disturbed. That the laws controlling conduct only
“indirectly” affect “freedom of speech” is unimportant to the principle
of balancing. An “indirect” effect can be very powerful.
(b) The position of the absolutist is “radically inconsistent” because, since
absolutists are compelled to distinguish “speech” from speech, they must
engage (if they are not to be utterly arbitrary) in the “difficult and delicate”
task of weighing the circumstances and consequences of various types of
utterances before bestowing the accolade of absolutely protected “speech”
on them. In other words, if the absolutists are saying (as one of their
extremist defenders interprets them) that once the “scope” of the First
Amendment is “properly determined,” then “whatever falls within that scope
should be regarded as having an unconditionally obligatory character”
(Frantz, 71 Yale Law Journal, 1432), it is obvious that the determination
of the “proper scope” of speech, to be reasonable, must depend upon the
process of balancing conflicting rights and interests. The “proper scope”
cannot be fixed forever.
(5) The question of balancing is often confused with the question of
who is to do it—the Court or Congress. The Court must do it with respect to
issues raised by state legislation and the decisions of courts of lower
instance, but whether and when the Court should also do it with respect
to Congressional legislation depends upon one’s views of the nature and
limits of judicial review in a free society.
(6) Even if one rejects the Jeffersonian view that judicial supremacy
over Congress is judicial usurpation and entrusts the balancing of rights
and interests ultimately to the Supreme Court, one must avoid the notion
that balancing requires that all questions be raised afresh, that nothing is
ever presumptively settled, that all the rights and interests balanced against
each other are of equal weight. The commonest mistake made by the
absolutists is to believe that unless the “preferred,” or what I call the
“strategic,” rights are absolute, they count for no more than other rights. Actually,
the intrinsic and instrumental value of any strategic right is of such
importance that it enjoys a presumption of validity which we only reluctantly set
aside in the face of weighty considerations that other important values would
be prejudiced by indulging it. Thus, for example, we regard freedom of
religion as so important that we lean over backward not to enforce mail
fraud statutes against religious practitioners who promise eternal life in the
next world to those who deed over to them all cash and real property in
this world. Were this scheme floated not under the sanction of religion
but merely on the basis of a promise to cure cancer, the law would step in.
All the arguments against the process of balancing turn out in the end
to be variations of the view that it is dangerous and ultimately disastrous
to make exceptions to general rules. Many effective replies can be made to
these arguments, but it is enough to say that granting the danger of making
exceptions to general rules, it is sometimes more dangerous and harmful
not to make exceptions.1
When we turn from the legal and jurisprudential questions to the basic
moral one—what determines our choice among conflicting goods and
rights?—we find that it is very difficult to formulate a principle that we can
show actually guides our moral choices. And this is true whether we
subscribe to a system of deontological ethics or one of ideal utilitarianism in
which justice is itself a good. The choice depends so much on the situation,
so much on the vast complexities of historical context, and is complicated
by the fact that in small matters we feel morally justified in taking “a moral
holiday.”
The United States is not only a democracy but also a community in
which decisions must be made affecting political rights and goods as well as
other rights and goods. The common or public good is not only a
procedural matter of equal consideration of all rights and interests but at
any specific time also requires a decision about which set of rights and
interests is to be furthered. Everyone appeals to “the common or public
good,” and most people agree that one or another action (like, for instance,
the New York City transit strike of 1966) is against the common or public
good even when they cannot define precisely what they mean.
No one has codified, or can codify, the middle order range of principles
by which we strike the balance between conflicting rights and interests,
and extend as well as delimit the areas in which they are respected. I do
not believe that we can find one formula to cover all cases in which there is
conflict between the rights to speech and security, to speech and privacy,
to a fair trial and to freedom of press, and so forth. Whether we invoke the
“clear and present danger” formula of Holmes and Brandeis in its different
interpretations, or Judge Hand’s principle applied in the Dennis Case,
“the gravity of the evil discounted by its improbability,” or the distinctions
between “expression” and “action” or “behavior”—no one criterion,
without obvious abuse of language, can apply to all the situations in which
reflection approves of abridgment.
Nonetheless certain broad ethical principles can sometimes guide us in
making proper distinctions or in setting limits to the scope or sphere of
rights in dispute. I want to conclude by stating one of these principles
which today may help us to make some useful distinctions between the
areas in which discrimination is morally permissible and areas in which
it is not.
There is a great deal of confusion about the rights and wrongs of
discrimination. We are vaguely aware that there is a profound difference
between irrelevant discrimination against an individual as a member of
a group and discrimination in favor of an individual based on a relevant,
noninvidious differentiation. Can we find some regulative principle
which will help clarify and order these distinctions? I believe we can if
we interpret the concept of democracy not in narrow political terms but
as a concept of social ethics. If we then define the fundamental ethical
principle of democracy, viewed as a way of life, as equality of concern on
the part of the representative agencies of society for all individuals to fully
develop as responsible persons, then we have some guide in determining the
locus and limits of permissible and impermissible discrimination. Essential
to the concept of a person is the right of choice, the freedom to develop
one’s tastes and judgments, and the overall pattern of one’s life.
Consequently, in those areas of our experience which are inherently personal—
friendships, family, cultural interests—we must be free to discriminate,
and legally protected in that freedom, even when morally our judgments
may be defective. For freedom to develop one’s personality carries with it
the freedom to blunder and err. However, in fields in which discrimination
prevents the development of personality in other human beings,
discrimination is not only morally wrong, it may be legally outlawed. What makes
it morally wrong to discriminate on the basis of irrelevant and arbitrary
considerations in the domains of citizenship, vocation, schooling, housing,
and so forth, are the obstacles such discrimination normally sets up to the
development of the responsible person by the consequent restrictions of
opportunities.
Equality of concern does not entail equality of specific treatment in
situations in which we are dealing with unequals in unequal situations
any more than the goal of health for every human being requires the same
regimen. The analogue to the continuous practice of scientific medicine to
discover what is the best way to make each and every individual healthy
is the continuous use of our creative intelligence to help every individual
become a responsible person.

NOTE
1. For a more extensive discussion, see my The Paradoxes of Freedom, Berkeley,
The University of California Press, 1964, pp. 46 ff.
JUSTICE AND RATIONALITY
Charles Frankel

Long before the word became fashionable, Ernest Nagel’s philosophy
was an example of a philosophy engagé. Its antiseptic, critical quality is the
expression not only of an intellectual passion for clarity and honesty but
also of a profound moral impulse. Nagel has never thought that human
reason can produce a world free from sorrow. But he does think that much
human grief could be avoided if men used reason, and he is disturbed that
philosophy, which should be the guardian and goad of reason, has so often
contributed to the disuse and misuse of it. One purpose of his thinking has
been to indicate why philosophy has served the cause of reason badly, and
to suggest what a critical and therapeutic philosophy might do to repair
the damage.
It would, of course, be a caricature of the variety and subtlety of Nagel’s
philosophy to suggest that any single teaching dominates it. But one of
his most important teachings with regard to the philosophy of science has
been that mistakes in the interpretation of science come from imposing
standards on science that are not themselves generated and corrected in
the course of actual scientific inquiries. And mistakes in other fields like
the law and morals, with all the costs in human suffering that follow from
them, also derive, he has suggested, from the same sort of error.
Philosophers, if I rightly understand Nagel, have failed to serve reason or human
welfare because they have projected conceptions of the nature and vocation
of reason, and have set standards for its guidance, that are irrelevant and
inoperable in the actual contexts of human thought and action.
No theme in philosophy better illustrates this point, perhaps, than the
classic theme of justice. What is justice? Why should men practice it? Is it
an ideal whose validity can be rationally demonstrated, as some philosophers
have hoped, in the way in which a geometric theorem can be demonstrated?
If not, does this mean, as other philosophers have concluded, that justice
is merely a high-sounding name for the will of the stronger or the satisfac¬
tion of our own prejudices? What, indeed, is the relationship of “reason”
or “rationality” to “justice”? What can we mean by “rational” or
“reasonable”—or can we mean nothing at all—when we use these terms to
characterize the decisions and actions that men take when they speak of doing
justice? It is to these questions that this paper is devoted.
Interestingly enough, classic rationalist answers to these questions have
been receiving increasing attention and respect in recent philosophy. A
growing number of philosophers who have an empiricist and analytic
orientation, indeed, have begun to espouse such answers. Social-contract
theory in particular has been revived and restated in a sophisticated
twentieth-century form, often with help from arguments drawn from the theory
of games. Such reformulated rationalist theories often propose not only a
“rational” and universal standard of justice, free from the taint of any
man’s or group’s special moral point of view; they also offer a general
vindication of constitutional democracy which shows that this system is
alone a legitimate social order because it alone incorporates universally
acceptable principles of justice.
The intellectual seriousness and the sense of social commitment that
inform these efforts are admirable, and the questions they raise are of
considerable importance. As a first step in our consideration of the
relationship of “justice” to “rationality,” it will be helpful to examine an
instance of these rationalist arguments.

Are there universal principles of justice whose validity can be
demonstrated by deduction from morally neutral and universally acceptable
premises? I shall restate an argument to which wide attention has been
given, which purports to answer this question in the affirmative.1
Let us first be clear what we are talking about. In order to keep the
subject manageable, the argument will be concerned with only one aspect
of the concept of justice. We commonly speak of “just” men, of “just”
actions or decisions, and of “just” social institutions or practices. We shall
be concerned with “justice” only as it is used to characterize social
institutions or practices. In any society, that is to say, established institutions
and practices determine that different members of the society will receive
different offices and positions, rights and duties, rewards and penalties. We
shall be concerned with “justice” and “injustice” in the sense in which these
words are used to characterize such arrangements for distributing these
benefits and pains of social existence.
Now in most complex and sophisticated societies, there are men who
attack some or all of these arrangements as unjust. Nor do they complain
simply that they do not receive what is rightly theirs in accordance with
established rules. They complain that the rules systematically deny them
(or others) their rights or give to certain members of the society more than
their fair share. They ask, therefore, that the rules be remade. And others
in such societies commonly disagree with them. Such arguments, it is clear,
involve contending conceptions of justice. Is there some neutral procedure
that could be used to arrive at a common conception of justice and thus
to settle such disputes rationally?
The argument under consideration is intended to answer this question.
It proposes that we conduct a mental experiment. Let us imagine a group of
wholly rational men in which such a dispute has arisen. They are rational
because they really know their own interests, because they exercise
intelligent forethought, because they stick to a plan of action without being
turned aside by momentary impulses, and because they are not moved by
envy. In Professor Rawls’ words, “The bare knowledge or perception of
the difference between their condition and that of others is not, within
certain limits and in itself, a source of great dissatisfaction.” Further, the only
thing that holds this exemplary group together is each man’s recognition
that it is to his self-interest to live and work with the others. In imagining
how their minds will work, therefore, we do not have to concern ourselves
with any collective sentiments or loyalties that transcend their individual
interests. This frees us, presumably, from having to take into account any
special anthropological considerations that will upset the universal
applicability of our conclusions.
No doubt, this is not a group of men who invite love or admiration.
They are figures in a Molière comedy, and not in any serious human
transaction. Their “rationality” is perversely simple, ascetic almost to the point
of lunacy. But these imaginary heroes nevertheless serve the purposes of
our analytic model. Since they are so free from normal human complexes
and complexities, it is easier to project their thinking processes. And we
cannot question that, in one traditional and fairly well-established use of
the term, they are “rational” men.
Now how would such a group adjudicate competing views about what
each man deserved or about the justice of their social system? If we can
say how they would deal with this problem, we would have, presumably,
at least the beginning of a rational concept of justice. And in no
immediately obvious sense, it would appear, would we be making any special
moral presuppositions.
The first problem our heroes would notice when they tried to adjudicate
conflicting claims is that they needed some common principles of
adjudication. This, in turn, would require an agreed procedure for finding
these principles. For while each man would no doubt have suggestions to
make about the principles that should be adopted, the discussion would
degenerate into a cat fight unless it were governed by certain rules.
Specifically, it would be plain to each participant that he would have to agree
that any principle he suggested was to apply, if adopted, to complaints
against himself and not only to the complaints he made against others; it
would be plain as well that once a suggested principle were adopted, it would
remain in force, saving only very special contingencies, on all future
occasions. In brief, all would recognize, given the conflict with which they
were faced, that impartiality and fairness were essential if there was to be
an uncoerced meeting of minds.
Once these rules of procedure were established—rules that would give
our heroes something like the restraints that go with subscribing to a
moral code—they would be in a position to work out a notion of justice.
And they would all agree, it can be argued, on three fundamental
principles:
(1) They would agree that each man was entitled to as much liberty as
was compatible with an equal liberty for all the others. For each is a free
and autonomous individual, acting in his own self-interest. The only reason
any one of them would give up any liberty would be that others did the
same and that this sacrifice was required by the group’s joint life and
activities.
(2) They would agree that only such inequalities in the distribution of
the benefits and burdens of their collective life could be justified as were
necessary to maintain practices that contributed to everybody’s net
advantage. For no rational man would accept a less favored position than
someone else’s unless he was convinced that, in doing so, he was better off
than he would be if the inequality did not exist.
(3) Finally, they would agree that no one should occupy a more favored
position unless that position were open to a fair contest in which all were
free to take part. For the recognition that a given inequality actually
benefits everybody would not be a sufficient reason for a self-interested
rational man to accept the fact that someone else profited more from the
inequality than he did. He would also have to believe that it was better
for everyone, including himself, that the other man held the superior
position. That man should be the man who, as a result of an open
competition, has demonstrated that he is best qualified.
Thus, the three constitutive principles of justice that would be adopted
by our imaginary group of rational men would be (1) the principle of
liberty, (2) the principle of equality, and (3) the principle of reward for
services contributing to the common good. Obviously, in any real-life state
of affairs, these principles would never be more than approximated.
Nevertheless, if the argument we have reconstructed is sound, any rational man
would recognize their validity as far as they go. We have thus presumably
arrived at a minimal but rational conception of social justice—one that
does not require any special or peculiar moral presuppositions but only a
calculation by each individual, no matter what society he occupies, of his
own self-interest.
404 3. THE CONSTRUCTION OF THE GOOD

II

Every reader will recognize the similarity of this argument to the
traditional arguments of social-contract theorists. They will recognize, too, how
strongly the argument responds to the desire, which for a hundred reasons
is so powerful today, to vindicate the belief that all individuals, whatever
the mores or codes of their societies may be, are the possessors of
fundamental rights. But does the argument succeed? Does it successfully prove
that justice is a concept that can be demonstrated on morally neutral
grounds, and that it is, in this sense, a “rational” ideal?
There are at least nine reasons for believing that it fails. While many
fewer than these would be sufficient to destroy the argument, the
examination of all of them will take us usefully into a number of the complex
questions related to the problem of justice.
(1) The argument for the principle of liberty rests on the crucial premise
that our imaginary group of rational men is composed of morally
autonomous individuals, each of whom speaks for himself alone, each of whom
possesses liberty, and each of whom values this liberty. Because this is so,
they come to the conclusion, not surprisingly, that liberty is a good thing
of which no one should be deprived without good reason. But this
conclusion is, of course, implicit in the premise of the argument, and the
premise is by no means one that there is any requirement to accept. We do
not have to suppose that all rational men either have liberty or prize it.
Indeed, we do not have to regard individuals as such as the ultimate
moral units. We can assign this ultimate status to families, clans, churches,
or corporations, and can ascribe value to their freedom and autonomy
rather than to the freedom and autonomy of individuals. Nor is there
anything forced or unusual in such a posture. If we look at the whole
history and range of human moral attitudes, the ascription of ultimate moral
autonomy to the group rather than to the individual is much more usual.2
(2) The principle of equality raises similar difficulties. No inequality
should be accepted, the argument asserts, unless the inequality can be
shown to contribute to everybody’s advantage. But what does this formula
mean?
We cannot mean by “everybody’s advantage” what any particular
individual thinks is to everybody’s advantage. This would entail the
imposition of that individual’s standards on everybody else. Neither can we mean
by “everybody’s advantage” what is determined to be to everybody’s
advantage by some independent and impersonal standard. This would require
us to find and justify such a standard, and would take us back to the problem
with which we started. And we cannot mean, finally, by “everybody’s
advantage” simply what each individual happens to think is to his own
CHARLES FRANKEL 405

advantage. Although this interpretation looks like the simplest and happiest,
it works out no better than the others. For the consequence of adopting
it is that any social arrangement that does not represent a bargain or balance
of interests satisfactory to all would have to be declared unjust. And this
would be to set a standard which it is difficult to think that any social
system could meet, and which would have no use at all where the most
troublesome social problems are concerned. For these problems normally
involve distributions or redistributions of benefits and burdens which
inevitably leave some people dissatisfied.
It is difficult to see, indeed, why there is even prima facie validity to the
idea that a given social practice is unjust unless each individual affected
finds it to his net advantage. It is quite common to call practices “just”
which are not in accord with this principle, e.g., Robin Hood’s stealing
from the rich to give to the poor, the denial of public funds to religious
institutions (or, conversely, tax exemptions for such institutions), and
some developing countries’ practice of taking property or positions away
from aliens to give to native citizens. Perhaps such practices are not just,
but some reasonable men at least argue that they are. Yet it would take a
very strained argument to show that any of them is to the net advantage
of all affected.
(3) The weakness in the argument is, in fact, even greater. By its very
nature, it gives us too little information to allow us to say just what any of
our hypothetical heroes should rationally accept as to their advantage.
Consider, for example, the process of thought that leads the members of
our group, at the very beginning, to accept rules requiring all to put
forward their ideas in a neutral and judicial spirit. Presumably, the reason
they accept such restraints is that each wishes to avoid endless contention
and dispute. But might not such contention and dispute be to the advantage
of at least one of them? Might it not allow him, for example, to go on
taking advantage of the others while the debate is going on? Presumably,
again, our heroes accept these judicial rules of procedure because they wish
to avoid violence and solutions by force majeure. But might not the
strongest among them be willing to take his chance on such an outcome?
We must, in short, make the additional assumption, which has nothing to
do with the “rationality” of the individuals concerned, that no individual
or faction among them is significantly stronger or shrewder than the rest.
And even if we make this assumption, must we not also assume that our
heroes have certain pacifist inclinations? For could not some or all of them
actually enjoy violence and count a good battle, with all its risks, among
their most cherished interests? Obviously, we do not know if they would
or they would not. Our rational men are ciphers. Yet it is only if we
presume to know enough about these ciphers to exclude such possibilities
that we can be sure that they would all think it to their self-interest to
seek the path of peaceful adjudication.
(4) A still further difficulty lies in the fact that the calculations of self-
interest which are actually demanded of our rational men are of two quite
different kinds. The tendency to confuse these two types of decision is
another fault of much social-contract theory. It is one thing to ask a man to
decide whether, in return for a specific and definite benefit, he is prepared
to make a specific and definite sacrifice. This is to ask him whether he
finds a given bargain advantageous. It is quite another thing to ask him
whether he is willing to run the risk of sacrifice in return for the possibility
of benefits. This is to ask him to take part not in a bargain whose costs
and benefits to himself he can predict, but in a game of chance in which
it is always possible that he will lose.
Consider rather commonplace decisions that must be made in any society.
Let us imagine, for example, that in our little club of rationalists the
proposal is made to build a bridge which, everyone in the club grants, is
vitally necessary if the group’s activities are to continue. However, it is also
plain that the building of the bridge is hazardous, and that it can be
reasonably predicted that some members of the society will die in the course of
the enterprise. What does it mean to say, under these circumstances, that
the building of the bridge will be to everybody’s advantage? Obviously, it
will not be to the advantage of those who die.
What we are actually saying, presumably, is that for a given group,
there is a percentage of risk and a percentage of benefit.3 In deciding
whether it is reasonable to take the risk, the individual can decide in his
public role as a citizen that as a statistical calculation the probable benefits
for the society outweigh the risks to some of its members. If, on the other
hand, we require the individual, for the purposes of our argument, to be
purely self-interested, he then has to ask himself whether he wishes to
apply to his individual case these statistical findings that apply to the group.
It is, of course, often reasonable to do this, in a quite normal sense of the
word “reasonable,” even though the application of a statistical regularity
to an individual case is not a demonstrative process. The individual’s
decision to bet on the probability depends, in the first place, on the value he
ascribes to what he will bet against what he might win, and, in the second
place, on the risks involved in his particular case.
There are, accordingly, two fatal flaws in the argument. The first is that
the individual has to face the possibility that the practice he is asked to
accept will not be to his advantage. It may still be reasonable, of course,
for him to accept the practice, but his acceptance will not be on the grounds
that the argument we are examining alleges are the only grounds for such
a decision. The second flaw, which is even more serious, is that our model
little “state of nature” gives us no help at all in determining what would
or would not be reasonable for any of our rational men to decide. For in
order to carry out such a calculation, we would have to know what specific
distribution of risks was in question, and what each man’s specific values,
skills, and social position were. Only then could we know what he stood
to gain and to lose, and what the risks were that he faced as an individual.
An elementary principle of politics and society, after all, is that social
costs fall differentially on different categories of citizens. In short, to say
what our rational individual would do, we would have to know who and
what he was. But by the nature of the argument we are considering, this
information is precisely what is excluded. Once again, we are dealing with
ciphers, and we are asked to do the impossible—to say how they would
calculate their advantage while we remain in ignorance of their outlooks,
their values, and their social positions.
(5) The empty character of the argument is further revealed when we
examine the reasons it offers why a group of men constituted as the
hypothesis claims would reject the proposal to form a slave or caste system.
According to Professor Rawls, there are two possible points of view from
which they might consider such a proposal. The first would be one in
which they did not know their relative talents and abilities and so could not
say whether they would obtain high or low positions in a free and open
social competition. Such individuals, according to Rawls, would
nevertheless opt for an open system and against a caste system for two reasons.
The first is that the statistical chances are greater that they would end,
in a caste system, in lower positions rather than in the more restricted
higher ones. The second is that a fixed caste system is notoriously more
inefficient, so that the chances that one’s general well-being would be
advanced by choosing it are also smaller.
But these arguments are unpersuasive. For we cannot know from the
information given how the individuals concerned will actually weigh their
“advantage.” Perhaps, for example, they would prefer not to have the
burdens of high position and would like the guarantee a caste system offers
that they can live free from responsibility. Another possibility is that, for
all their desire for high position, they would not like the uncertainties and
stresses that go with competition for such a position and think it not worth
that much trouble. Indeed, from the point of view of his self-respect
(which is, after all, a strong element in self-interest), a rational man,
calculating the probabilities, might very well come to the conclusion that he
would rather take a chance on living as a slave in a society in which he
could truthfully claim that his abilities were never fairly tested than risk
defeat in an open competitive society where such defeat would demonstrate
that he really was an inferior man!
As for the general “inefficiency” of a caste society, it is obvious that
everything depends on what goes into the notion of “efficiency.”
Historically, rather a large number of people who have not been conspicuously
irrational, and who have occupied menial positions as well as privileged
ones, have counted among their blessings the stability promised by a fixed
system of social preference and the pleasures provided by the spectacle of
an ornamental leisured class. Just as it is possible, in this wicked world,
to want a class of priests who pray for all, so it is possible, in this work-
weary world, to want a class of aristocrats who play for all.
When we make a second and different assumption and look at the
proposal to institute a slave or caste system from the point of view of
people who know their relative talents and abilities, the case is equally
weak. It might appear, to be sure, that the abler would obviously be willing
to face a contest, knowing that they would win. And it might also appear
that the less able, even though they knew they would lose, would prefer
an open competitive system because it would at least ensure that those
on the top recognize that their favored positions were the marks of socially
useful talents and carried an obligation to serve the common good.
But there is, in fact, nothing automatic or infallibly just about either of
these inferences. For if the abler man really knows he is abler, why should
he risk a contest to prove it? Contests do not always turn out as they
should; accidents happen. And as for the great majority of the less able,
who know they are going to lose, why should they take the hard knocks
of a contest where the outcome is foreordained? The fact appears to be
that our model group of rational men have the rather special orientation
of members of a competitive society—otherwise the “rational” conclusions
to which they come would not be what they are.
(6) The argument is further open to attack on the more general ground
that “equality” and “inequality” are indeterminate terms. A meaning can
be assigned to these terms only when we specify the standard of
comparison to be employed, and the specific characteristics of the individuals
concerned which are to be compared. Men may be equal in height, skin
pigmentation, the prestige of a family pedigree, the obligation to give
unconditional obedience to a despot, and in innumerable other characteristics.
Some selection among these characteristics has to be made. In any socially
significant statement that men are equal or unequal, there is the implicit
decision that the particular characteristic in terms of which they are being
compared is a relevant and important characteristic (e.g., the ability to do a
specific job) while some other characteristic (e.g., skin color or family tree)
is not.
In other words, determinations of equality presuppose certain substantive
moral standards. And when men complain about inequalities, they may be
complaining not that a given standard which they accept is unfairly applied
but that the standards being used are the wrong standards. These are in fact
the complaints that generate the most important and difficult arguments
about “equality.” However, the model of justice which we are considering
offers no help with regard to such arguments. It appears to presuppose,
indeed, that they will not arise.
(7) When we turn to the concepts of “free and open competition” and
“reward for services to the common good,” we encounter similar problems.
The idea that a given distribution of burdens and benefits is fair when it
reflects the results of a free and open competition rests on the obvious
assumption that the individual’s merits are the proper basis for determining
what is due him. But merit is not the only available criterion. Individual
needs, for example, are also a relevant criterion, and are regularly so
employed, e.g., in special provisions for the protection of women, children,
and the aged. Many of the classic problems of distributive justice are
intelligible only if we see that they involve weighing the claims of such different
criteria against one another.
Moreover, even if we take it as somehow self-evident that merit is the
sole criterion, there is no reason to assume that there is a necessary
connection between the results of a free and open competition and the
principle of merit or of reward for service to the common good. In the
first place, the merits that may enable a man to win a fair contest for a
post are not necessarily the merits that enable him to perform the duties
of that post best. In the second place, there are different types of contests,
all equally fair and open, that test quite different capacities and yield quite
different results. A man may complain, when he is required to play a certain
game, not that the game is played unfairly but that it is the wrong game,
that it rewards talents of the wrong kind and condemns a man with his
traits to the status of permanent loser. Professional football is not a game
for bantamweights, and the competition of the marketplace is not the
competition that permits poets to demonstrate their worth.
This sort of dissatisfaction lies behind many kinds of social protests.
A formula that speaks only of “free and open competition,” therefore, is
radically inadequate. Choices have to be made between kinds of
competition, and such choices require additional moral premises.
(8) The definition of “rationality” with which we have been working
loads the dice. Its moral bias is evident in its exclusion of envy from among
the traits of our rational men. According to Professor Rawls, the perception
that others are better off “is not, within certain limits and in itself, a source
of great dissatisfaction.” But the phrase “within certain limits” is clearly
relativistic. What one man or society may perceive as a minor inequality
may strike another man or society as a reason for revolution.
And beyond this, why is envy “irrational”? If we are taking a nonmoral
point of view, envy is simply one more human feeling, on a par with anger,
mother’s love, the pleasures of physical exertion, or the desire for revenge.
It is a genuine human emotion, and its satisfaction, if a man feels it,
is as much a part of his self-interest as the satisfaction of any other interest.
And from this point of view, if Peter is envious of Paul and we cut Paul
down to Peter’s size, thus making Peter feel better, we contribute to Peter’s
advantage.
This, no doubt, is intellectually inconvenient. It means that we have less
reason than ever to accept as intelligible a standard of justice which
demands that all inequalities be to everybody’s advantage. Yet there is no
nonmoral reason for excluding envy from the traits of our imaginary
rational men. Nor is this an abstract point made only for the purposes of
debate. Envy, though not all men may feel it, is not an esoteric human
emotion. It is a fairly widespread and powerful one, and it is present in
many of the quarrels between men in which they call for “justice.” Nor
does its presence prove that what they demand is wrong. The government
of men depends on dealing with envy, not ignoring it. To exclude envy
from our abstract rational calculus may be a methodological convenience,
but it is also a moral prejudice and an unrealistic one.
(9) Finally, let us ask what is really proved by an argument which
requires us to speak not of specific and determinate individuals who
complain of specific injustices but of abstract individuals engaging in highly
abstract calculations of individual advantage. In Professor Rawls’
formulation, “It is a mistake to focus attention on the varying relative positions
and well-being of particular persons, who may be known to us by their
proper names. ... It is the system of institutions which is to be judged,
and judged from a general point of view. Unless one is prepared to criticize
the system of institutions from the standpoint of a representative man
holding some particular office, one has no complaint against it.”4
But a “representative man” about whom, by hypothesis, we know noth¬
ing else can have only the representative interests of the particular office
he holds. He is not distinguishable from his social role, and that role’s
function is defined by the very system of institutions we are supposed to be
criticizing. He cannot, therefore, have any interests but those that are
congruent with the system. An argument whose purpose is to vindicate
individual rights thus ends by dissolving the individual into his social role.
On this approach, needless to say, any social system will be found just.
The finding is tautological.
We thus come to a pervasive fault of our model argument. This is its
formalistic character—its effort to reach substantive moral conclusions
from premises that are morally neutral. Pulling such a rabbit out of the bag
is an old effort of philosophers. But this present effort to overcome
elementary rules of logic is no more successful than previous ones. And the
effort leads to paradoxical results. An argument that begins with the hope
of vindicating individual rights ends by ruling out of bounds any
consideration of individuals who have proper names, specific and determinate
histories and concerns, and an identifiable individuality apart from their
representative social functions. And an argument that seeks to establish
an independent standard for judging all social systems ends with the
implicit conclusion that whatever is is right.

III

Is this the end of the matter? Is “justice,” then, merely a name for our
prejudices? What do reason and rationality have to do with it?
That we can connect rationality and justice still seems to me possible.
What makes it difficult is only the insistence, against all the rules of logic,
that we must find a way to extract a moral principle from premises that
contain no moral terms. But while philosophers have regularly tried to
perform this bit of magic, it is puzzling that anyone should think that a
moral principle like justice is somehow lifted in moral status when it
is shown to be derivable from neutral principles that are not moral at
all.
The problem is to try to be clearer about what we mean by “rationality”
in the actual context of decisions we make about justice. Since the
argument we have been considering is concerned, among other things, with
vindicating the idea of fundamental individual rights, let us consider another
kind of argument that can be given to vindicate them. It is the kind of
argument which is ordinarily given, I think, when thoughtful men with
different intellectual or moral outlooks pay joint attention to this question.
Briefly, the concept of justice which incorporates the notion of basic
individual rights emerged as an organizing principle to guide and justify
the lengthy historical process by which, in the modern era, individuals
were freed from irrevocable bonds to social groups or to their inherited
stations in life. The concept holds that all men and women should be
defined, in custom, law, and morals, not simply in terms of their particular
social titles or functions but as legitimate claimants on other grounds to
certain sorts of equal treatment. It says that beyond the ways in which
society designates the position of an individual, and beyond the claims
it makes upon him, he has his own view of himself and his own aspirations
and potentialities, and that this view, too—a man’s own consciousness of
what he would be—has its legitimate claim to the attention and respect
of others. This appears to me to be a large part of what we mean by “the
equality of men.” And I take it that we mean to say, when we use this
phrase, not only that a man’s view of himself as a being larger than his social
roles should be respected, but also that something of value is lost, some
reasonable good is denied, if a man becomes, or is turned by his
environment into, simply a passive creature of his position without this quality
of self-identification and self-consciousness.
Why should we adopt such an ideal? We do not have to. There are
alternatives. We can insist, for example, that self-consciousness is disturbing,
that personal freedom of choice is dangerous, that individual rights represent
a claim that has no standing when compared with such overwhelming
demands as national strength and solidarity, economic growth, or the
protection of religion and morality against irresponsible experimentation and
eccentricity. But if we take such a position, we are still confronted with
the results of historical developments in the West (and now in almost all
societies), which have disengaged individuals from fixed positions and made
them mobile beings confronted by an increasing variety of choices. Nor
can we undo the developments in science, industry, and communication that
have given to those who control the centers of economic and political power
the resources they now have to marshal and mobilize individuals.
Presumably, if we do not care for individual rights, this will not in itself
disturb us. But there are other probable consequences of choosing to
organize a modern industrial, urban society without regard to the principle
of individual rights—among them the manipulation of science, art, and
philosophy, the subjugation of religion and culture to reasons of state,
and the destruction of decencies in human relations that have been
enshrined not only in the doctrines of human rights but in other traditions of
our civilization. If we do not wish such consequences, and if we believe
that we are going to have to live with the broad conditions described, a
decision in favor of the doctrine of human rights as a central organizing
principle for society appears to be the best alternative open to us.
This is, it seems to me, the kind of argument which is ordinarily given
when people have doubts about the basic principle of human rights. It is
not, perhaps, peculiarly “philosophical,” but it appears to me to be a
reasonable and objective argument, although it would no doubt not satisfy a man
from Mars. However, I see no reason why we should try to satisfy this
intruder into our human business. The argument presupposes that those
to whom it is addressed will find it compelling only because they are men
in a specific historical situation and have social memories, historical loyalties
and attachments, and personal interests and hopes. And yet this argument
is not a merely personal argument nor is it arbitrary. It appeals to values
that are not only my own, and to facts whose status as facts can be
determined independently of anybody’s values. To call it “relativistic” or
“noncognitive” is to say nothing more damaging than that it presupposes the
existence of moral concerns as a basis for moral discussion. But this
is surely the distinguishing characteristic of a moral argument, an
indispensable element of such argument without which it has no force either
psychologically or logically. In this kind of argument we have a more
promising clue, I think, to the nature of “rationality” as it applies to
social ideals like justice or human rights than is provided by a priori
rationalistic arguments.
In the case of the more general concept of justice, the classic belief that
justice is inextricably linked to rationality reflects, I believe, three essential
ingredients of the idea of justice. In the first place, it calls attention to the
fact that the concept of acting in accordance with a rule is part of what we
mean by “justice.” Broadly, we can justify this emphasis on regular
procedure on the ground that security and predictability are necessary and
desirable in human affairs, even though this is a principle to which there
will be some obvious exceptions.
However, since this is a purely procedural concept of justice and does
not tell us whether the rules to which obedience is demanded are good or
bad, it is clearly insufficient by itself. At a second level, where questions
of substantive justice are involved, justice is related to rationality in the
sense that we may ask whether, as an empirical matter of fact, the rules
in question are likely to achieve the purposes for which they presumably
exist. To ask such a question is to ask whether the rules are “rational” in a
quite ordinary workaday sense of the term. (We may also ask, of course,
whether the purposes or ends they serve are “rational” or “reasonable”—
but that is another question.)
Finally, at a third level, the belief that “justice” and “rationality” are
connected calls attention to the relation between decisions about justice
and orderly processes of reasoning. “Justice,” whatever it means, cannot be
determined by caprice. But if one does not hold that there are universal
rational principles, independent of any particular system of rules and
postulates, can we give meaning to this third and most embracing notion
of the connection between justice and rationality? I believe that the answer
is “yes.” From an empirical point of view, the insistence that justice must be
“rational” points to the fact that any particular distribution of social burdens
and rewards is relative to specific and limited criteria which may themselves
be questioned; it requires us, therefore, to keep inquiry open, regarding
the status quo always as provisional. To wish to be “rational” with regard
to justice is to wish always to look for what has been neglected or left out;
it is to strive to achieve as comprehensive and tolerable a balance as is
possible among the competing claims that are at issue in a given context.
The connection between justice and rationality, from this point of view,
is a connection between specific judgments and a regulative ideal. It does
not imply that the idea of rationality is a premise from which a concept of
justice can be deduced.
Such a preconception, indeed, seems to me one of the more damaging
contributions of Western philosophy to Western civilization. It is responsible
for setting a will-o’-the-wisp as a goal of philosophical inquiry; it has caused
groundless anxieties about the status of our central ideals, and it has diverted
philosophers and others from dealing with the substantial and genuinely
difficult issues actually presented by moral and social choices. These do not
have to do with the logical demonstration of moral principles. They have
to do with the adjustment of these principles to one another when they
conflict, with their application to specific cases, and with their consequences
when we act upon them. The theory—and the implementation—of human
rights will be served better, I venture to think, by wrestling with such
problems rather than by abstract efforts to demonstrate human rights on a priori
grounds.

NOTES

1. The argument is drawn from the ingenious and carefully reasoned articles of
Professor John Rawls, particularly from his “Justice as Fairness,” Philosophical
Review, Vol. LXVII (1958) and “Constitutional Liberty and the Concept of
Justice,” Justice, ed. by C. J. Friedrich and J. W. Chapman (New York:
Atherton Press, 1963). My interest, however, is not in a narrowly based debate with
Professor Rawls but in the broader issues which his arguments bring to the surface.
In order to keep these broader issues clearly visible, and in order, also, to avoid the
repeated use of the quotation mark, I have stated this argument, for the most part,
in my own language. I do not believe that I have departed from Professor Rawls’
version in any fundamental respect.
Professor Rawls’ arguments are representative of a broader current of thought.
They have been made not only by philosophers but by political scientists, often
accompanied by disclaimers of any attempt to make “unscientific” or “personal”
value judgments. See, for example, James M. Buchanan and Gordon Tullock,
The Calculus of Consent: Logical Foundations of Constitutional Democracy (Ann
Arbor: University of Michigan Press, 1962), which offers parallel arguments in the
context of an effort to develop analytic models for a theory of collective choice.
2. Professor Rawls agrees in the course of his arguments that in using the term
“individual” it is not necessary to mean by it a human individual in the biological
sense, but any unit, within a society, that acts as a moral agent. Thus, he writes, “A
more realistic conception of such a society might construe its persons as mutually
self-interested families or some other association.” (Friedrich and Chapman, eds.,
Justice, p. 103.) But this concession seems to destroy the argument, for it becomes
compatible with situations in which no human individuals possess liberty or other
basic rights defining just treatment.
3. One of the purposes of Professor Rawls’ analysis is to show that the argument
for justice cannot be a purely utilitarian argument. If the above rejoinder has
validity, however, his own argument, in fundamental respects, is utilitarian.
4. In Friedrich and Chapman, eds., Justice, p. 102.
BEYOND PARETO OPTIMALITY
James S. Coleman

All theories of action which endow the actors with purposes or goals
are principally designed to describe individual action. But if such theories
are to be generally useful, they should also be able to deal effectively with
collective action, in which the actor is a collectivity made up of two or more
individual actors, who may sometimes have conflicting goals. Yet such
theories have had difficulty in accounting for collective decisions and collective
action. The contrast with individual decisions is instructive. In the
theory of rational individual behavior under certainty, an actor can order
different outcomes in terms of their utility for him. He will, by definition of
utility, choose the outcome that has highest utility, among those available
to him. In decision-making under uncertainty, where the outcome is only
partly determined by his action, a similar calculus applies, except that he
associates a subjective probability to each outcome under each action, as
well as a utility (with metric properties) to each outcome, and chooses the
action with the highest expected utility for him.
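The calculus just described can be made concrete with a small sketch. It is only an illustration under assumed numbers: the action names, subjective probabilities, and utilities below are hypothetical and do not come from the text.

```python
def expected_utility(probabilities, utilities):
    """Weight the utility of each outcome by its subjective probability and sum."""
    return sum(p * u for p, u in zip(probabilities, utilities))


def best_action(actions, utilities):
    """Return the name of the action with the highest expected utility.

    `actions` maps an action name to the actor's subjective probabilities
    over the outcomes, listed in the same order as `utilities`.
    """
    return max(actions, key=lambda name: expected_utility(actions[name], utilities))


# Hypothetical example: two outcomes, valued at 0 and 10 by the actor.
outcome_utilities = [0.0, 10.0]
subjective_probabilities = {
    "cautious": [0.5, 0.5],  # expected utility 5.0
    "bold": [0.2, 0.8],      # expected utility 8.0
}
chosen = best_action(subjective_probabilities, outcome_utilities)
```

Under these assumed figures the actor chooses "bold". Nothing comparable is available once the "actor" is a collectivity whose members rank the outcomes differently.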
But a collective action, involving two or more such rational actors, is
not so directly treated. With two alternatives, say A and B, and only three
actors, there are 2^3, or eight, possible patterns of preference.
These are:

AAA
AAB
ABA
BAA
ABB
BAB
BBA
BBB

For only two of these patterns of preference, all preferring alternative A,
or all preferring alternative B, will the classical theory of rational action be
relevant. If all prefer A, A will be chosen, and if all prefer B, B will be
chosen, but for none of the other six patterns is the theory applicable. For
the two patterns of complete consensus, the theory is applied by use of
the principle of Pareto optimality. Pareto optimality is in effect an extension
of the calculus of individual rational behavior under certainty to collective
decisions. It states that if one outcome is better than a second for one
member of the collectivity and at least as good for all others, then the
collectivity will be better off under the first outcome than under the second.
The first can be said to dominate the second. Further, all outcomes that
are not dominated by others constitute the Pareto-optimal set of outcomes,
and the theory states nothing about which of these outcomes will be chosen.
If there is only one Pareto-optimal outcome, then the theory is sufficient.
But if there is more than one, it is inadequate. In the case of two outcomes
and three actors, if all three prefer A, then A is Pareto optimal and
dominates B. If all prefer B, it dominates A. But for the six patterns in
which there is disagreement, both outcomes are Pareto optimal, and the
theory is insufficient to predict a collective choice. Obviously, as the size
of the collectivity increases, the theory becomes less and less applicable.
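The count of decided and undecided patterns can be checked mechanically. The sketch below is a minimal illustration, assuming strict preferences over two alternatives so that each actor's preference is a single letter. The function names are hypothetical. The last function adds a further assumption not made in the text, namely that all preference patterns are equally likely.

```python
from itertools import product

ALTERNATIVES = ("A", "B")


def dominates(x, y, pattern):
    """x dominates y when every actor in the pattern prefers x to y.

    With two alternatives and strict preferences, "at least as good for all
    and better for one" reduces to unanimous preference for x.
    """
    return x != y and all(choice == x for choice in pattern)


def pareto_set(pattern):
    """Alternatives that are not dominated by any other alternative."""
    return [x for x in ALTERNATIVES
            if not any(dominates(y, x, pattern) for y in ALTERNATIVES)]


def consensus_share(n_actors):
    """Share of patterns with full consensus, assuming equally likely patterns."""
    return 2 / 2 ** n_actors


patterns = ["".join(p) for p in product(ALTERNATIVES, repeat=3)]
undecided = [p for p in patterns if len(pareto_set(p)) > 1]
```

Here `patterns` holds the eight patterns listed earlier and `undecided` holds the six on which the Pareto criterion is silent. Under the equal-likelihood assumption, `consensus_share` halves with every added actor, which is one way of reading the remark that the theory becomes less applicable as the collectivity grows.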
Thus, although Pareto optimality provides a start toward the extension
of the theory of rational behavior to collective decisions, it is a rather
small start. In this paper, I want to continue this extension, toward an ultimate
goal of full applicability of the theory of rational behavior to collective
decisions.
First, before proceeding further, it is important to clarify just what is
meant by “better off” in the criterion of Pareto optimality. These problems
have in economics sometimes been treated within the area of welfare economics,
and particularly in this area, it is sometimes unclear whether a
theory is normative or a positive theory about behavior. Specifically, who
is to decide when a person is better off under one policy than under another?
If the economist, sociologist, political scientist, or politician is to decide,
on the basis of objective conditions, that everyone is better off under one
policy than another, or that particular persons are better off and others are
worse off, then the theory is obviously a normative one, and the principal
task of the theorist is to determine criteria for judging whether people are
better off under one policy than another. But it has become clear in the
application of Pareto optimality and in welfare economics generally that
the principle of individual autonomy is the only valid way of determining
whether an individual is better off. That is, the concept of being “better off”
under one condition than another has no meaning other than a subjective
one: A person is better off under state A than state B when he himself believes
he is better off, that is, when he will choose A in preference to B.
That he might misjudge future states and thus bring himself into situations
where he is worse off is clear; but this does not negate the principle that
only he can be the judge of his welfare. Thus if the social scientist or
politician believes that a policy will make him better off, it is their responsibility
to try to convince him of this, for his welfare is his own affair.1
This means that the application of the theory of rational behavior to
collective decisions and the application of Pareto optimality and other
criteria must be concerned not with objective conditions but with individual
preferences and the expression of these preferences in choices, in voting
behavior and in other political action. Such an implication has not always
been clear, particularly to welfare economists. The confusion on this score
will be examined briefly later in this paper.
The recognition of this subjective nature of welfare occurs in the theory
of public finance at least beginning with Knut Wicksell (1896; see Musgrave
and Peacock, 1958), whose theory of taxation explicitly let the citizen
decide his costs and benefits in tax policy. More recently, in Kenneth
Arrow’s (1951) monograph on social choice and individual values, the
principle of individual autonomy or citizen sovereignty is incorporated as a
postulate of the system. And in the work of present theorists of public
finance (e.g., Buchanan 1967, Musgrave 1959), the importance of citizen
sovereignty as the ultimate criterion of the social value of a tax-benefit
scheme is explicit.

A REVIEW OF ATTEMPTS TO GO BEYOND PARETO OPTIMALITY

Collective decisions can be conveniently divided into two types. One is
the decision between two or more specific alternative actions, such as the
decision between political candidates. The other is a decision whether or not
to carry out a particular action, such as the passage of a bill in the legislature.
The latter appears to be by far the more frequent type, and we will limit
our discussion to it. From the point of view of the collectivity, such a decision
is a decision to act or not to act. From the point of view of the
individual, it is ordinarily a comparison of the present and a specific
future, though some individuals may be using other possible future states
against which to judge this one. Thus if a collectivity is at a Pareto-optimal
point, there is no bill which can gain complete consensus, and no action
toward Pareto optimality can be taken. In any large collectivity, it is probably
true that any given point is Pareto optimal, so that if only moves
toward Pareto optimality were to be taken, there would be no action at
all.
In most collectivities, most collective decisions are carried out under decision
rules that allow action under less than perfect consensus. These decision
rules appear to be a compromise between the pure criterion of Pareto
optimality and the political reality that total consensus is almost never
realized. The decision rules for carrying out an action ordinarily range
from a majority rule to a unanimity rule. Often the degree of consensus
necessary for carrying out an action is a simple majority of those voting,
but in some cases, complete unanimity is required. The question of what
factors lead to a more or less stringent decision rule is an important one,
but one that will not be treated here.
In the attempt to go beyond Pareto optimality, the first point to strike
one is the intuitive feeling that an issue may be more important to some
persons than to others, and thus their expression should somehow count
more strongly. This intuitive feeling appears to have much justification in
everyday life; we frequently make a judgment of how important a course
of action is to another when we differ with him, before deciding how far
to press our own way. This feeling has also had expression in early work
in welfare economics, which assumed that the utilities of different individuals
for a given outcome could be added, and then the outcome with the highest
aggregate utility be the course of action taken.
Yet this intuitive feeling of “intensity” has been an extremely elusive
point to capture theoretically. It appears to lead immediately to interpersonal
comparisons of utility. Yet the individual, who is the only judge of
how important one thing is for him relative to something else, is a very
biased judge of how important something is to him, relative to its importance
to someone else. But neither is that other person a good judge. Ordinarily,
each has his own interest at heart; and for him, any sacrifice is greater than
whatever sacrifice the other might make. If, on the other hand, he strongly
identifies with the other, his judgment will be influenced by the strength of
that identification—again providing no judicious basis for deciding the
issue.
Because of the difficulties of capturing this notion of “intensity” of feeling,
we will first examine attempts that have not tried to capture it directly.
Knut Wicksell is one of the earliest to attempt to soften the stringent
conditions of unanimous agreement while still staying within its confines.
He pointed out that the only insurance that a given tax policy (for it was
taxation policy that concerned him) would provide an increase in social
welfare is for it to be unanimously passed, consistent with Pareto optimality.
That is, only if all persons favored the tax could there be certainty that
the tax would produce more benefits than costs. For, from the same general
viewpoint as Pareto optimality, if all favored the tax, then it was
certain to provide an increase in welfare, while if even one person opposed
it, who could say that the costs it incurred for him were outweighed by
the benefits it provided for others?
Obviously, such an extreme proposal as total consensus can hardly be
reached. But Wicksell proposed two procedures to facilitate its achievement.
The first is always to associate a given tax with the benefits it was
designed to provide, so that both would be voted on together. Thus each
person could balance in his mind the benefits he would enjoy by virtue
of the tax versus the costs he would incur. Wicksell’s second point was
then to allow infinite adjustments in the tax-and-welfare proposal. If one
group felt that the costs outweighed the benefits to them, and thus voted
against a proposal, a second proposal would be devised that would be
acceptable to them. The same procedure would be followed until all parties
were satisfied that the benefits outweighed the costs for them.
This procedure appears, however, to have several flaws. Wicksell assumed
that each person would compare his costs and benefits in deciding
whether to vote for the proposal. But there is nothing to prevent his use
of a different frame of reference. Suppose one tax-benefit plan, plan X, is
proposed, and it satisfies individual A but not B. Then in order to satisfy
B, another, plan Y, is proposed, more favorable to B. At this point Wicksell
assumed that A would still compare plan Y to his present state and,
if it were better, would vote for it. But having made a tentative move to
plan X, A might compare plan Y to plan X and, finding it not better
for him, vote against it. More generally, the individuals might use strategy.
If one party, though he feels his benefits outweigh his costs, believes he
can receive more benefits or pay fewer costs by holding out, he may well
do so. Furthermore, it is those who have least to lose, for whom the excess
of benefits over costs is least, who can most afford the chance of no policy
at all and thus most afford to veto any plan. This use of strategy could well
stymie legislation so that none would be passed, and all would lose.
A second flaw is that the principle is only applicable to certain kinds of
issues. It is applicable only to issues in which quantitative adjustments can
be made for specific individuals or groups, and only to policies which
provide some specific costs and benefits to all persons—excluding policies
that provide benefits only to some (e.g., the unemployed or dependent
mothers or parents of school children) and policies where the benefits
are not so easily defined.
If Wicksell’s proposal were viable, it would provide a means of finding a
Pareto-optimal move if one exists. If we assume that adjustments in the
distribution of the tax could be made with infinite precision (presumably
taking infinite time to do so), then a positive judgment could be made in
every case: If, for a given benefit scheme, some tax schedule could not pass
unanimously, we could definitely say that the costs outweighed or equaled
the benefits, and thus the present state is equal to or better than any possible
proposed one, involving this set of benefits.
However, this approach of Wicksell’s is concerned with a particular kind
of decision: a tax policy, in which many variations in policy are possible. A
major element in the proposal is adjustment of the tax policy until it becomes
acceptable. However, many issues are not like this. The collective decision
will be either yes or no, with few, if any, adjustments possible. The question
becomes acceptance or rejection, with little hope of so modifying the policy
that it might be passed. A declaration of war, passage of a civil rights bill, a
decision to build a county courthouse, all of these are concrete proposals that
can be modified only slightly if at all. Thus, for such issues, the possibility of
a resolution by modifying the bill until it becomes acceptable does not exist
or is greatly reduced.
A somewhat different approach was taken by some early economists
toward surmounting the constraints of Pareto optimality. Early writers,
along with assuming cardinal utility for an individual, assumed that utilities
for different individuals could be added, thus providing, in principle at
least, a sum total of utility for the population. If this were possible, it would
eliminate the need for the stringent Pareto test of individual-by-individual
increases in utility for a proposed policy, and allow a definite judgment of
better or worse than present for every proposed policy.
This easy assumption of interpersonal comparison was questioned at
some length, beginning in the 1930’s. Much confusion, which is not yet
wholly dispelled, developed. Perhaps the most important cause of confusion
lies in the confusion about the role of the economist himself and the role of
the subjects about whom he is theorizing. Classically, the role of the economist
in welfare economics was as policy adviser. Thus, reflecting this
classical position, Roy Harrod (1938) says:

Consider the repeal of the Corn Laws. This tended to reduce the value of a
specific factor of production—land. It can no doubt be shown that the gain
to the community as a whole exceeded the loss to the landlords—but only if
individuals are treated in some sense as equal. Otherwise how can the loss to
some—and that there was a loss can hardly be denied—be compared with the
general gain? If the incomparability of utility to different individuals is strictly
pressed, not only are the prescriptions of the welfare school ruled out, but all
prescriptions whatever. The economist as an adviser is completely stultified,
and, unless his speculations be regarded as of paramount aesthetic value, he had
better be suppressed completely. No; some sort of postulate of equality has to
be assumed.

Harrod recognizes that he as an economist must explicitly be making
some assumption about the relative weights or importance of different
persons’ happiness, and he is willing to apply equal weights, at least in
policies involving restriction of competition. Lionel Robbins (1938), on
the other hand (Harrod’s principal antagonist in this argument), is not.
He replies, reflecting upon the evolution of his own views:

But, as time went on, things occurred which began to shake my belief in the
existence of so complete a continuity between politics and economic
analysis. ... I am not clear how these doubts first suggested themselves; but
I well remember how they were brought to a head by my reading somewhere—
I think in the work of Sir Henry Maine—the story of how an Indian official
had attempted to explain to a high-caste Brahmin the sanctions of the Benthamite
system. “But that,” said the Brahmin, “cannot possibly be right—I am ten
times as capable of happiness as that untouchable over there.” I had no sympathy
with the Brahmin. But I could not escape the conviction that if I chose
to regard men as equally capable of satisfaction and he to regard them as
differing according to a hierarchical schedule, the difference between us was not
one which could be resolved by the same methods of demonstration as were
available in other fields of social judgement. ... “I see no means,” Jevons had
said, “whereby such comparison can be accomplished.”
Harrod and Robbins both agree that interpersonal comparison involves
some judgment by the economist of the importance of different persons’
utilities; but Harrod, in order to continue to give policy advice, is willing
to make such judgments while Robbins is not. Presumably, most economic
theorists felt similarly uncomfortable, despite the egalitarian origins of
welfare economics, and thus were quick to follow a point suggested the
next year by Kaldor (1939): that in some cases, two policies could be
objectively compared without interpersonal comparison of utility, if a shift
from the first to the second brought wide enough benefits to some groups
that those who lost by the shift could be fully compensated by those who
gained. This compensation principle has similar attributes to Knut Wicksell’s
taxation scheme, in which the bill would be continually adjusted until
all benefited from it. With the compensation principle, if compensation
were in fact carried out (a matter which was unclear in the proposal),
everyone would be better off under the second policy than the first, and
the move would be a Pareto-optimal move. This approach, which formed
the basis for further work by Hicks (1939), Scitovsky (1941), and Samuelson
(1950), bypassed the question of whether the economist was to compare
different persons’ utilities but maintained the role of the economist as
adviser concerning the welfare implications of economic policy. Finally,
Samuelson showed that it was highly improbable that there were any moves
that would pass sufficient tests to insure that the second position was
better than the first, and this approach came to appear less profitable. Yet
during this time there was little or no resolution of the question of the
economist’s role and the question of interpersonal comparison of utility.
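The compensation principle described above can be sketched in a few lines. This is an illustration only, under a strong assumption that is not in the text: each group's gain or loss from the shift is expressed as a single money figure. The group names and amounts are hypothetical, as is the rule of charging gainers in proportion to their gains.

```python
def passes_compensation_test(changes):
    """True if the gainers could fully compensate the losers and still gain.

    `changes` maps each group to its money-valued gain (+) or loss (-).
    """
    return sum(changes.values()) > 0


def after_compensation(changes):
    """Net position of each group once losers are made whole.

    Losers are brought back to zero; each gainer pays a share of the total
    loss proportional to his gain (an assumed rule, one of many possible).
    """
    gains = sum(v for v in changes.values() if v > 0)
    losses = -sum(v for v in changes.values() if v < 0)
    if gains < losses:
        raise ValueError("gainers cannot fully compensate losers")
    result = {}
    for group, value in changes.items():
        if value <= 0:
            result[group] = 0.0
        else:
            result[group] = value - losses * value / gains
    return result


# Hypothetical figures for a shift that hurts one group and helps two others.
shift = {"landlords": -30.0, "consumers": 50.0, "manufacturers": 20.0}
net = after_compensation(shift)
```

With these figures the test passes, and once compensation is actually paid no group is worse off, so the shift would command unanimous consent. If the payment is not made, the test says nothing about whether the landlords would agree, which is the ambiguity noted in the paragraph above.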
I. M. D. Little, in his critique of welfare economics (1957, pp. 52-53),
shows the confusion most fully, for he retrogresses from Harrod’s and
Robbins’ recognition of the distinction between the economist’s and the
actor’s judgment:
We may now ask whether people do compare differences in satisfaction or
happiness, and whether they add them. As regards the first, there is no doubt
whatever. I can say “£1 would make more difference to Smith than it would
to Jones.” Such statements are made, and when such a statement is made it is
correct to say that the man who makes the statement is comparing the differences
in satisfaction which the addition of £1 to his purchasing power would
make respectively to Smith and to Jones. . . . There is also no doubt that we
compare different people’s total happiness (where the word “total” must not
be taken to imply that happiness is a sum of parts). We frequently maintain
that A is happier than B, and it is obvious that when we do so we are not
talking nonsense. We can also, roughly, compare the amount by which A is
happier than B, with the amount by which B is happier than C.
Here it is unclear who is making the judgment: the economist, who is
outside the system of behavior to be explained, or another actor within the
system. As a consequence, Little adds only confusion in his discussion of
interpersonal comparison, but maintains the position that the welfare economist’s
role is to give policy advice, and that the fruits of welfare economics
should be to advise “policy-makers” that policy X gives greater or
less welfare than the present state.
Meanwhile, a slow shift in interpretation of a social welfare function was
taking place. The concept had always implicitly been about the aggregate
welfare, whether measured individual-by-individual, as implied by Pareto
optimum, or measured through interpersonal comparisons of utility. But
the answer to the question of who was to do the aggregating and how it was
to be done was where the confusion lay. Kenneth Arrow’s book, which purported
to be about a “social welfare function,” discussed aggregating procedures
that were tantamount to voting rules. For this reason, many economists
refused to consider Arrow’s approach as relevant to welfare economics.
That is, if Arrow’s approach had led to some set of acceptable
decision rules, it would have taken the welfare economist out of the role of
economic policy adviser, since the policy question would thus be put in the
hands of voters operating under an appropriate decision rule. The result of
this implication was that some welfare economists disavowed the idea that
a welfare function should necessarily have any relation to individual choice.
For example, Little (1952, p. 424), in a critique of Arrow’s work, writes:

. . . my interpretation of a Bergson [social welfare] function requires only that
there should be an order. It does not require that it should be an order such
that anyone would want to say of it that it represented the choices of society.

Again, unwilling to give up the role of economist as judge of what constitutes
welfare, Little (1952, p. 426) says, in commenting on Arrow’s
condition of nondictatorship:2

Let there be three men and two alternatives, x and y. Let the orders be xy
for Tom and yx for both Dick and Harry. The [Arrow] conditions then preclude
xy as the master-order. The two [economic] states may be such that in y
Tom has one piece of manna, while Dick and Harry both have ninety-nine
pieces; in x Tom has three pieces and Dick and Harry both have ninety-eight
pieces. Might not then the master ranking xy be desirable?

To this we can only answer that obviously the master-ranking xy would be
desirable to Tom and to I. M. D. Little but certainly not to Dick and Harry.
This passage shows particularly well the peculiar assumption of the welfare
economist: that there is a God somewhere, standing outside society, in
whose eyes something is better or worse for society; and it is the economist’s
role, like that of the Pope, to be the expositor of that God’s views to the
temporal rulers. Indeed, Little at one point invokes someone whom he calls
“superman” to carry out such judging. At this point it should be mentioned
that the assumptions of early welfare economists, such as Pigou and Bergson,
were that utility was maximized when the marginal utility of money was
the same for all, and that this occurred when the amount of money income
(or among some authors, “real” income) was the same for all. Thus welfare
economics has always had an egalitarian bias, often implicitly assuming that
social welfare is maximized when incomes are equal but not being quite
willing to back up this assumption with policy advice to equalize incomes.3
Even at present, as in Little’s review (1957), a redistribution is called good
if it is toward equality, and bad if it is toward inequality.
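Little's manna example can be worked through directly. The sketch below is a minimal illustration. The holdings are those given in the quotation, and each man is taken to prefer the state in which he holds more manna, which agrees with the orders Little assigns. The two function names and the use of the range as a measure of inequality are assumptions made for illustration.

```python
# Holdings of manna in the two states, as given in the quoted passage.
x = {"Tom": 3, "Dick": 98, "Harry": 98}
y = {"Tom": 1, "Dick": 99, "Harry": 99}


def majority_prefers(first, second):
    """True if more men hold more manna in `first` than in `second`."""
    votes_first = sum(1 for name in first if first[name] > second[name])
    votes_second = sum(1 for name in first if second[name] > first[name])
    return votes_first > votes_second


def spread(state):
    """Gap between the largest and smallest holding (a crude inequality measure)."""
    return max(state.values()) - min(state.values())


total_x = sum(x.values())
total_y = sum(y.values())
```

The total is 199 pieces in both states, so an unweighted sum of manna does not separate them. Dick and Harry outvote Tom in favor of y, while x has the smaller spread (95 against 98). A ranking of x over y therefore rests on a preference for equality supplied from outside the three men's own choices.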
Though Arrow’s book and the large volume of work spawned by it
dealt with voting rules, most of these authors did not envision the loss
of the role of the welfare economist as policy adviser (speaking, of course,
in behalf of the welfare of the people). Yet both Pareto optimality and
Kaldor’s compensation principle implied
a criterion of a unanimous vote as the welfare test of a new policy.
Furthermore, as indicated earlier, in a related but different tradition of
economic theory, the theory of public finance, the principle of individual
choice had long before been carried to its logical conclusion in the explicit
discussion of voting procedures by Wicksell and Lindahl. Only very recently,
in the work of Buchanan (1962) and others, has there been an explicit
recognition of a role for the welfare economist other than that of policy
adviser, a role of showing through theory the indirect consequences of a
policy—showing what the effects of a given action will be for each group in
the population and attempting to find policies that will meet with collective
agreement.
One of the disturbing aspects of the discussion of interpersonal comparison
of utility by most authors is a failure to recognize the basis on which
a concept in a theory may legitimately be endowed with certain properties,
whether these properties be the ordinal, interval, or cardinal properties of
real numbers. It is one of the earlier achievements of economics that the
appropriate criteria have been applied in choice under certainty, with the
result that an ordinal conception of utility has replaced the cardinal conception
of Marshall. The appropriate criteria can be stated in several ways,
but in essence, they are: Considering the behavior which the theory is
designed to describe, does it include any behavior which may be used to
assign values and to test properties (ordinality, cardinality, etc.) that these
values are assumed to obey? In economic behavior under certainty, which
classical economic theory is designed to cover, the choice behavior of individuals
does not have the properties that allowed a cardinal measure of
utility but only properties that allowed an ordinal measure.4
Yet no economists appear to have explicitly applied a similar test to the
question of the interpersonal comparison of utility, and some have not even
applied it implicitly. As a result the most bizarre reasons are given for accepting
or refusing to accept interpersonal comparison. Harrod accepted
them in order to get on with his task of giving policy advice. Robbins implicitly
applied the criterion and rejected interpersonal comparisons because
disagreements could not “. . . be resolved by the same methods of demonstration
as were available in other fields of social judgement.” Jevons was
closest to an explicit application of the correct criterion when he said: “I
see no means whereby such comparison can be accomplished.” The deterioration
of thought among some present economists is evident in the work of
Little, referred to above, who says that because people make interpersonal
comparisons (not that they take economic actions on the basis of them),
such comparisons have a place in economic theory.5
If we explicitly apply the criterion and ask whether there is any behavior
among the actors that allows the calibration and test of a utility that can be
compared interpersonally, then we have two possible answers:
(1) We can accept the role of the welfare economist as policy adviser
who judges whether a policy is better than another, in which case the individuals
who are the subjects of his advice carry out no actions that will
allow interpersonal comparisons. Their interactions are limited to economic
exchange, and in those interactions the only comparisons are intrapersonal:
the relative utilities to one person of different goods. If we make this answer,
then interpersonal comparisons cannot legitimately be made, and the economist
is forced to restrict his advice to those moves that are Pareto optimal,
withholding judgment on all those where one or more persons would suffer
by the move.
(2) We can reject the role of the economist as policy adviser, and state
that the policy should derive from the behavior of the individuals in the society
itself through a decision process. In this case, since decisions are made
and accepted, often with extended struggles and conflict, then we must
recognize that interactions do occur other than those of exchange, and that
these interactions make manifest some sort of interpersonal comparisons.
If we accept this answer, it implies the extension of the subject matter of
economics into political behavior as well, for it is this political behavior that
constitutes interpersonal comparison. Furthermore, this interpersonal comparison
is not the genteel passive concept that is implicit in economists’
definitions of utility, but includes intrinsic to it a notion of political power,
that is, the power to realize one’s goals. Specifically, the idea of interpersonal
comparison of utility implies two comparisons. First, since the very concept
of utility of a good implies the willingness to sacrifice something to obtain
the good, any interpersonal comparison must include a manifestation of this
willingness on his part to sacrifice something, proportional to the utility he
stands to gain or lose by the outcome. But it must include as well a test of
the relative efficacy of his resources and of those of his opponent in realizing
their goals. This sacrifice-and-test may take the form of conflict, or it may
take the form of compromise; but whatever form it takes, it is political behavior
and requires the economist to encompass that behavior within his
theory.
In exploring the implications of this second position, the most fruitful
procedure is likely to be to return to an examination of the intuitive basis
which makes us feel that the importance or strength of feeling on an issue
does count in collective decisions. First, let us consider those situations in
which persons are not highly identified with one another. This starting point
is best justified by the fact that it begins from the rational-man premise of
economics. Of the rational man, all that can be assumed is that he will act
in his interest as he sees it. The special case in which he sees his own interest
to lie partly in another’s realization of his interests can be treated separately.
In a disagreement over an issue between two participants, our intuitive
judgment of the strength of each participant’s feeling lies in an assessment
of how far we feel each might go to get his way. This can be seen most
clearly in a sharp disagreement between two persons that could lead to a
fight. If we see that one party is “uncontrollable,” that is, that he would
stop at nothing to win, then we know that in fact he will be likely to win
because he is willing to employ all his resources to that end. However,
another point also arises: We must judge just what this party could do if he
were willing to stop at nothing. If he has no power to help or hurt the other,
then the other ordinarily need not worry, however uncontrollable the first
may appear to be. That is, if the second party has much more power than
the first, he can afford to pay no attention to this intensity of emotion.
In a disagreement of the sort described above, the resolution may occur
through the perception by each of the other’s intensity of interest and his
power to realize his interest. A husband coming home one night may see that
it is “very important” to his wife that they go to the movies. Knowing also
that she has the power to cause him great discomfort, he will go unless it
will discomfort him even more to go to the movies. This may be termed
a “virtual” confrontation, since neither party actually applies his resources
to gain his way. However, the resolution may also occur through an actual
confrontation in which each actually does apply his resources to the extent
that this issue is of interest to him.
Thus, given the existence of a disagreement, two attributes of each individual
are relevant in determining its resolution: first, what resources he has
that give him the capacity to help or hurt the other; and second, how
willing he is to use these resources in order to gain his goal. Quite clearly,
if two persons disagree on a common course of action and neither has any
capacity to help or hurt the other, then the resolution is indeterminate, no
matter how important the issue may be to one party or how little important
to the other.6
This intuitive assessment in the preceding paragraphs of the kinds of
interactions that occur when no mutually profitable action is possible (i.e.,
no Pareto-optimal move is possible) can serve as the basis for a more formal
discussion of how such issues are resolved and how some measure of the
critical variables, utility and power, can be carried out. The succeeding
sections will first examine how political processes make possible the measurement
of the relative intensities of preference associated with each issue
for a given individual; then the measurement of power and the transformation
of individual utility into social welfare; and finally, the conditions
under which an issue may come, like a good, to have a value, divorced from
the intensities of preference that each individual associates with it.

INTENSITIES OF PREFERENCE OR INTEREST IN AN EVENT

At the outset, several definitions and conventions must be established.
First, we shall be concerned in this section with the system composed of
two individuals, 1 and 2, and a set of events, 1, 2, . . . , n, over which these
individuals jointly have control. These events have two outcomes, a and b,
with a being maintenance of the present state and b change to another
state.7 The events are ordered so that events m + 1, . . . , n are those in
which 1 and 2 agree, either on outcome a, the present state, or on outcome
b, the new state. Events 1, . . . , j are those in which individual 1 favors
the present state, a, and individual 2 favors the new state, b. Events j + 1,
. . . , m are those in which individual 1 favors the new state, b, and individual
2 favors the present state, a. The decision rule is the one that requires
unanimity for a change. Thus as matters stand, there will be change only
on those events in which both favor b, that is, a subset of events m + 1,
. . . , n. Since this decision rule is one that requires a move to be Pareto-optimal
before it is made, it is conservative because it resolves all disagreements
in favor of the status quo. (A symmetric decision rule, which decides
probabilistically in the case of disagreement, will be discussed later.)
With each outcome of each issue, each individual associates a utility,
which at the outset we do not assume to obey any special properties. For
individual 1 and outcome a of issue i the utility is $u_{ia1}$ (the subscripts denote, in order, the event, the outcome, and the individual).
Given these definitions and conventions, what can we say about decisions?
(i) First we consider the state of the system resulting from a decision
procedure in which each individual casts his vote (for the present policy a
or the new policy b) on each issue, and then the outcome is determined by
tabulating all the votes and using the decision rule. From the knowledge that
individual 1 favors the present state and individual 2 favors a change for
events 1, . . . , j, we know that for all events i within this range,
$u_{ia1} > u_{ib1}$ and $u_{ib2} > u_{ia2}$. Similarly, for all events k within the range j + 1,
. . . , m, this preference information tells us that $u_{kb1} > u_{ka1}$ and $u_{ka2} > u_{kb2}$.
This information allows us merely to establish an order among the two outcomes
of each event and not to compare utilities for outcomes from different
events, much less to compare utilities of individual 1 with those of
individual 2.
We can further say that by this decision rule and these preferences, the
gains to society in the contested events are ($u_{ia1}$, $u_{ia2}$) (for i = 1, . . . , m),
and the losses are ($u_{ib1}$, $u_{ib2}$) (for i = 1, . . . , m). There is no way to know
whether the gains outweigh the losses over different events for the same
individual or over the two individuals, for these quantities are incommensurable,
given the range of behavior allowed so far. We can only be certain
that the gains outweigh the losses for those events m + 1, . . . , n, with which
we are no longer concerned.
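
The unanimity rule of procedure (i) can be illustrated with a minimal sketch. The five events and their votes are hypothetical, chosen only to show that change occurs solely where both individuals favor b. The function name `unanimity_outcome` is an illustrative assumption and not notation from the text.

```python
# Minimal sketch of decision procedure (i): a unanimity rule with no exchange.
# The events and votes are hypothetical illustrations, not data from the essay.

def unanimity_outcome(votes):
    """Return 'b' (change) only if every vote is 'b'; otherwise keep status quo 'a'."""
    return "b" if all(v == "b" for v in votes) else "a"

# Each entry: event -> (vote of individual 1, vote of individual 2).
votes_by_event = {
    1: ("a", "b"),  # contested; individual 1 favors the status quo
    2: ("a", "b"),  # contested; individual 1 favors the status quo
    3: ("b", "a"),  # contested; individual 2 favors the status quo
    4: ("a", "a"),  # agreed: both favor the status quo
    5: ("b", "b"),  # agreed: both favor change
}

outcomes = {event: unanimity_outcome(v) for event, v in votes_by_event.items()}
changed = [event for event, outcome in outcomes.items() if outcome == "b"]
print(changed)
```

Every contested event stays at the status quo, which is the sense in which the rule is conservative.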
(ii) Second, having carried out the preceding vote, the individuals are
now allowed to vote once again, this time being free to carry out prior
negotiations on any two events at once (though their negotiations must be
limited to two events at a time). In this case, suppose we find that on
events 1, . . . , g, and j + 1, . . . , j + g, the decision has been reversed,
and the new policy is voted by both individuals. Observation shows that
this occurred through a political exchange: Individual 1 offered to give up
his control over event i ($1 \le i \le g$), allowing a change to policy b, in return
for control over event k ($j + 1 \le k \le j + g$), in which 2 would allow a
change to policy b. What further information does this give us about the utilities
of 1 and 2? Since we know that 1 gave up control over issue i for control
over issue k, this means that from the two changes in decision he has made
a net gain. Comparing his gains and losses from the change,

                      Event i      Event k
gain:                 $u_{ib1}$    $u_{kb1}$
loss:                 $u_{ia1}$    $u_{ka1}$
Net gain or loss:     loss         gain

This new behavior on his part, together with the previous information that
$u_{ia1} > u_{ib1}$ and $u_{kb1} > u_{ka1}$, tells us that the gain he experiences for the new
policy on event k is greater than the loss on event i, or rather that:

$$u_{kb1} - u_{ka1} > u_{ia1} - u_{ib1}$$

More simply, the outcome of event k is more important to him than the
outcome of event i, or he has more interest in event k than i. For simplification
of further treatment, we can give a precise definition of his interest in
an event. If we define $y_{k1} = u_{kb1} - u_{ka1}$, then $y_{k1}$ is negative if he favors the
status quo and positive if he favors a change. Then if we define $x_{k1} = |y_{k1}|$,
he will be willing to give up control of i to gain control of k if $x_{k1} > x_{i1}$; and
the quantity $x_{k1}$ is his interest in event k.
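
As a small numerical illustration of these definitions, the sketch below computes the signed difference y and the interest x for individual 1 on two events and checks his willingness to trade. The utility values and the helper names are hypothetical; the numbers carry meaning only on individual 1's own scale.

```python
# Sketch of the interest measure: y = u_b - u_a (signed), x = |y| (interest).
# All utility numbers are hypothetical and live on individual 1's own scale only.

def interest(u_a, u_b):
    """Return (y, x) for one individual and one event."""
    y = u_b - u_a
    return y, abs(y)

def willing_to_exchange(x_given_up, x_gained):
    """True if giving up control of one event for control of another is a net gain."""
    return x_gained > x_given_up

# Event i: individual 1 favors the status quo (u_a > u_b).
y_i1, x_i1 = interest(u_a=5.0, u_b=3.0)
# Event k: individual 1 favors the change (u_b > u_a).
y_k1, x_k1 = interest(u_a=1.0, u_b=4.0)

trade_i_for_k = willing_to_exchange(x_given_up=x_i1, x_gained=x_k1)
print(y_i1, x_i1, y_k1, x_k1, trade_i_for_k)
```

Only the comparison of the two interests matters here; nothing in the sketch compares individual 1's numbers with anyone else's.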
This added freedom for the behavior of these two individuals gives a
partial ordering of the utility difference between outcomes for each, or his
interest in the event, that is, a partial ordering of the $x_{i1}$ and of the $x_{i2}$. This
partial ordering is very incomplete, since it establishes order relations only
between pairs of x’s, i.e., those pairs of events on which a political exchange
was made. However, we can extend this partial order by slightly modifying
the way this new vote is made: by asking both individuals to state, for all
pairs of events, whether they would or would not be willing to make an
exchange. Thus if there are r events in the set (1, . . . , j), and s in the set
(j + 1, . . . , m), both individuals would give an answer for all r × s pairs
of issues. Exchanges would then be made by matching pairs of issues on
which both individuals had responded positively. The responses generated
by this procedure would allow testing of transitivity of the x’s and would
make the partial order much more complete.8 There still would not be interpersonal
comparison between the $x_{i1}$’s and $x_{i2}$’s, nor a metric associated with
them.
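
The pairwise questioning just described might be sketched as follows. The interest values, the split of events between the two individuals, and the function name are all hypothetical; the sketch only lists the pairs on which both would respond positively and does not attempt the subsequent matching.

```python
# Sketch of the extended procedure: both individuals answer for every pair
# (i, k), with i from the events individual 1 controls and k from the events
# individual 2 controls. Interests are hypothetical, each on its owner's scale.

from itertools import product

# x1[e], x2[e]: interest of individual 1 and of individual 2 in event e.
x1 = {1: 2.0, 2: 6.0, 3: 5.0, 4: 1.0}
x2 = {1: 4.0, 2: 1.0, 3: 3.0, 4: 2.0}

controlled_by_1 = [1, 2]  # individual 1 favors the status quo here
controlled_by_2 = [3, 4]  # individual 2 favors the status quo here

def mutually_acceptable_pairs():
    """List pairs (i, k) on which both individuals respond positively."""
    pairs = []
    for i, k in product(controlled_by_1, controlled_by_2):
        one_agrees = x1[k] > x1[i]   # 1 gains k, gives up i
        two_agrees = x2[i] > x2[k]   # 2 gains i, gives up k
        if one_agrees and two_agrees:
            pairs.append((i, k))
    return pairs

pairs = mutually_acceptable_pairs()
print(pairs)
```

With r = 2 and s = 2 there are four pairs to answer for, and in this illustration only one of them is acceptable to both.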
These outcomes also give some further information about the gains and
losses of these decisions to society. The general result is that comparing this
procedure, allowing political exchange, to the first which did not allow such
exchange, there is a net gain to society, as has often been claimed for increases
in freedom of trade.9 This gain can be expressed by considering only
those events on which a change in outcome occurred between this procedure
and the more restrictive one, that is, events 1, . . . , g, and j + 1,
. . . , j + g. Let us now assume the events j + 1, . . . , j + g, ordered so
that j + 1 is exchanged with event 1, j + 2 with event 2, and so on. The
end result of this exchange is a gain of $x_{j+1,1}$ and a loss of $x_{11}$ for
individual 1 on events j + 1 and 1, and a gain of $x_{12}$ and a loss of $x_{j+1,2}$ for
individual 2 on events j + 1 and 1. For both individuals there is an overall
gain (i.e., $x_{j+1,1} > x_{11}$, and $x_{12} > x_{j+1,2}$), as implied by the willingness of
each to exchange. Thus the gain to society from the new political procedure
can be expressed by the set of g pairs of gain-and-loss for individual 1, and
the set of g pairs of gain-and-loss for individual 2. For both individuals for
each of the g pairs, the gain is greater than the loss. Sums could be carried
out over the two sets separately (not assuming comparability between x’s
for the two individuals), but we are not yet qualified to assume additivity
for an individual’s x’s, but only an order among the x’s. Nor do we need to
assume additivity, since the gain exceeds the loss for each pair of events
taken separately.
(iii) Now we allow an added freedom of behavior beyond that allowed
under (ii), that is, the freedom to carry out political exchanges in which
more than two events are involved. Individual 1 can offer to exchange his
vote on one or more events for 2’s vote on one or more events. If he offers
to exchange three for one, this means that the combined event (which we
will label d) consisting of several individual events (say i, k, and c) is of
less interest to 1 than the single event which he is attempting to gain control
over. We assume no interaction among events, by which we mean that we
assume 1’s interest in this combined event, $x_{d1}$, to be given by:

$$x_{d1} = x_{i1} + x_{k1} + x_{c1}$$

This assumption is a strong one, for it implies properties of additivity
among 1’s interests. But this is a property that can be tested in the behavior
itself. For example, if 1 were indifferent between control over event c and
control over event h, then he should be indifferent between control over
combined event d and control over combined event e which consists of
events i, k, and h. This test examines the interaction between event c and
events i and k, by testing the proposition: If $x_{h1} = x_{c1}$, then $x_{d1} = x_{e1}$, where
d and e are combined events consisting of i, k, c, and of i, k, h, respectively.
Obviously many such individual tests must be carried out to provide a
reasonable degree of confirmation of the assumption of additivity or no
interaction. The important point, however, is that with this greater political
freedom for negotiation, behavior occurs which makes it possible to give
numerical values to the x’s, and to test whether these values have the property
of additivity possessed by real numbers.10 The second important point
is that, as will be shown, this added political freedom will give social welfare
higher than or equal to that obtained under (ii).
To see this last point, assume first that all the single-event exchanges have
been made under condition (ii), and then the new possibility exists for
multi-event exchanges. Either no further negotiations would occur (in which
case the social welfare would remain the same), or some further exchange
would occur. We can see immediately that if such an exchange occurs, it
increases the welfare of each, for it would not otherwise be made. Thus the
relevant question is, can such an exchange be made, given that no single-event
exchanges are possible? That this is possible can easily be shown.
Suppose individual 1 wanted to gain control over event h, and offered control
over either i or k for it. But the exchange did not occur because for
individual 2, just as for individual 1, $x_{h2} > x_{i2}$ and $x_{h2} > x_{k2}$. However,
if 1 is now allowed to offer control over both i and k for h, his offer means
that $x_{h1} > x_{i1} + x_{k1}$, and if 2 accepts, this means that $x_{h2} < x_{i2} + x_{k2}$. Both
of these inequalities are compatible with the previous single-event inequalities.
Thus if this exchange occurs, the gain to 1 is $x_{h1} - x_{i1} - x_{k1}$, and the
gain to 2 is $x_{i2} + x_{k2} - x_{h2}$, both of which are positive.
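
A worked illustration may make the compatibility of these inequalities concrete. The interest values below are hypothetical and assume additivity as discussed above; the helper `accepts` is an illustrative name, not part of the text.

```python
# Worked illustration of a multi-event exchange under condition (iii).
# Individual 1 controls events i and k; individual 2 controls event h.
# All interest values are hypothetical; each dict is on its owner's own scale.

x1 = {"h": 10, "i": 3, "k": 4}   # interests of individual 1
x2 = {"h": 6, "i": 4, "k": 5}    # interests of individual 2

def accepts(x, gained, given_up):
    """An individual accepts if the summed interest gained exceeds that given up."""
    return sum(x[e] for e in gained) > sum(x[e] for e in given_up)

# One-for-one offers fail: individual 2 will not release h for i or k alone.
single_i = accepts(x1, ["h"], ["i"]) and accepts(x2, ["i"], ["h"])
single_k = accepts(x1, ["h"], ["k"]) and accepts(x2, ["k"], ["h"])

# The bundled offer (i and k for h) is acceptable to both.
bundle = accepts(x1, ["h"], ["i", "k"]) and accepts(x2, ["i", "k"], ["h"])

gain_1 = x1["h"] - x1["i"] - x1["k"]
gain_2 = x2["i"] + x2["k"] - x2["h"]
print(single_i, single_k, bundle, gain_1, gain_2)
```

Both one-for-one trades are blocked by individual 2, yet the two-for-one trade leaves each individual with a positive gain on his own scale.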
Now it is possible to express the welfare to the two members under each
of the three conditions. Let us assume additional events, labeled g + 1, . . . ,
h, and j + g + 1, . . . , f, are exchanged under condition (iii).11 The baseline
from which the welfare below is expressed is, for each event and each individual,
the utility of the outcome he likes least. By slight modifications of these expressions, a
baseline representing the status quo could be used.

(i) individual 1:

$$\sum_{i=1}^{j} x_{i1} + \sum_{i=m+1}^{n} x_{i1}$$

individual 2:

$$\sum_{i=j+1}^{m} x_{i2} + \sum_{i=m+1}^{n} x_{i2}$$

(ii) individual 1:

$$\sum_{i=g+1}^{j} x_{i1} + \sum_{i=j+1}^{j+g} x_{i1} + \sum_{i=m+1}^{n} x_{i1}$$

individual 2:

$$\sum_{i=1}^{g} x_{i2} + \sum_{i=j+g+1}^{m} x_{i2} + \sum_{i=m+1}^{n} x_{i2}$$

(iii) individual 1:

$$\sum_{i=h+1}^{j} x_{i1} + \sum_{i=j+1}^{f} x_{i1} + \sum_{i=m+1}^{n} x_{i1}$$

individual 2:

$$\sum_{i=1}^{h} x_{i2} + \sum_{i=f+1}^{m} x_{i2} + \sum_{i=m+1}^{n} x_{i2}$$

It is not evident from the sums given above, but the inequalities presented
earlier show that at every step the changes in the
summations, in going from (i) to (ii) and from (ii) to (iii), all represent
gains for both individuals. Thus at every step, there is an increase in the
welfare of both individuals, produced by the increase in freedom of political
exchange. Yet there is still no basis for comparing the utilities or interests or
welfare of the two individuals. It is to this we next turn.

INTERPERSONAL COMPARISON AND POWER

In condition (iii) above, each individual’s utility differences, or interests,
in an event were scaled on his own scale, so that the units are noncomparable.
Thus it may well be that the sum of the x’s for individual 2 is 100 times
the sum of the x’s for individual 1, with no consequence, since no behavior
exists in which their relative sizes are compared. Since the scales are arbitrary,
let us rescale each individual’s interests as a proportion of his total
interests in the events which make up this system of behavior (keeping the
same notation for the rescaled values). Thus the sum total of $x_{i1}$ for all n
events is 1.0, and the sum total of $x_{i2}$ for all n events is 1.0. This does not
imply comparability between these interests, for who can say that 1’s total
interests are not far more (or less) important than those of 2?
We can, however, make one kind of comparison between these two:
Under each voting scheme, each individual realizes a certain proportion of
his total interests. Thus under (i), individual 1 realizes

$$\sum_{i=1}^{j} x_{i1} + \sum_{i=m+1}^{n} x_{i1}$$

and 2 realizes

$$\sum_{i=j+1}^{m} x_{i2} + \sum_{i=m+1}^{n} x_{i2}$$

each of these two quantities being some proportion less than or equal to 1,
since the sums of $x_{i1}$ and of $x_{i2}$ over the whole range are each equal to 1.0. For
the other conditions, the sums are as given above under (ii) and (iii). Thus
if each individual acted with perfect rationality, each must have realized that
part of his interest allowed by his power in the system, plus an additional part
due to the conjunction of interests between him and the other. In other words,
his power in this system of events can be defined as his ability to realize his
own interests. It is evident that his ability to realize his interests depends
both upon his “constitutional control” over particular events and upon his
distribution of interests. Thus under conditions of no exchange, it is evident
that the power of individual j is proportional to the sum of $c_{ij} x_{ij}$,
summed over all events i, where $c_{ij}$ is the degree of constitutional control of
event i by individual j. That is, under conditions of no exchange, he has
much power only if he controls those events that interest him greatly. He
cannot use his control of other events for any purpose whatsoever, so this
control does not add to his power. In the special case of a unanimity rule
(as in the case of any voting rule which is not symmetric between the two
outcomes of the event), his constitutional control $c_{ij}$ depends upon the direction
of his interests. If his interests favor the status quo, $c_{ij} = 1$, while if
they favor the new policy, $c_{ij} = 0$. Thus the measure of power for individual
1 is proportional to $\sum_{i=1}^{j} x_{i1} + \sum_{i=m+1}^{n} x_{i1}$. If the total power
in the system as a whole is standardized to equal 1.0, then the power of 1
(labeled $r_1$) is given by:

$$r_1 = \frac{\sum_{i=1}^{j} x_{i1} + \sum_{i=m+1}^{n} x_{i1}}{\sum_{i=1}^{j} x_{i1} + \sum_{i=m+1}^{n} x_{i1} + \sum_{i=j+1}^{m} x_{i2} + \sum_{i=m+1}^{n} x_{i2}}$$

In calculating this power, the quantities $x_{i1}$ and $x_{i2}$ have for the first time
been taken as commensurable. That is, the proportion of an individual’s
interest that he realizes politically is commensurable with the proportion of
another individual’s interest that he realizes politically, for these are measures
of the political power of each. Thus it is only in the assessment of
relative power that the interests of 1 and 2 can be compared. And at this
point, they not only can be but must be compared, for the power of an
actor in a system is his ability to realize his interests.
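
The calculation of relative power under the unanimity rule can be sketched numerically. The four events, their preferences, and the rescaled interests below are hypothetical; the sketch follows the expression in the text by counting, for each individual, the interest in every event whose outcome he prefers, including events on which both agree.

```python
# Sketch of relative power under the unanimity rule with no exchange.
# Interests are hypothetical and rescaled so that each individual's sum is 1.

from fractions import Fraction as F

# For each event: (preference of 1, preference of 2, x_i1, x_i2).
events = [
    ("a", "b", F(4, 10), F(1, 10)),  # contested; 1 favors the status quo
    ("a", "b", F(3, 10), F(2, 10)),  # contested; 1 favors the status quo
    ("b", "a", F(2, 10), F(3, 10)),  # contested; 2 favors the status quo
    ("b", "b", F(1, 10), F(4, 10)),  # agreed on change
]

def realized_interests(events):
    """Sum each individual's interest over events whose outcome he prefers."""
    realized_1 = realized_2 = F(0)
    for pref_1, pref_2, x_1, x_2 in events:
        outcome = "b" if pref_1 == pref_2 == "b" else "a"  # unanimity rule
        if outcome == pref_1:
            realized_1 += x_1
        if outcome == pref_2:
            realized_2 += x_2
    return realized_1, realized_2

realized_1, realized_2 = realized_interests(events)
r_1 = realized_1 / (realized_1 + realized_2)
r_2 = realized_2 / (realized_1 + realized_2)
print(realized_1, realized_2, r_1, r_2)
```

Exact fractions are used so that the two powers sum to exactly 1, matching the standardization of total power to 1.0.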
This introduction of relative power as the only source of “interpersonal
comparison of utility” is so foreign to the usual ideas of interpersonal comparison
that some discussion is warranted. Consider how an interpersonal
comparison of utility could be made between husband and wife. When an
American Indian husband rides while his wife walks carrying a burden,
can we compare the utility or disutility that each is experiencing? If a
modern American husband takes the bus to work while his wife uses the
car, can we compare the utility differences that each would experience
between the two activities? In both cases, I think the answer is No. We can
observe only the outcome, and from that infer not how important it is to
one compared to its importance to the other but something quite different:
the power that each has to realize his interest in this arena of decision-making
and the amount of interest this event has for him relative to other
things that also interest him. In the case of the American Indian it would
be necessary, first, to observe numerous choices of the husband and wife
separately between walking, riding, and other activities; and second, to
determine the relative power, it would be necessary to observe the outcomes
of other events involving activities of varying levels of importance to each.
Similarly, with the modern American husband and wife. Obviously, the
main difference between these two examples is not one of utility differences
but of differences in relative power of the husband and wife. There is no
standard for comparing these utilities other than the standard they (in their
larger social context) establish for themselves, and this standard is not one
of utility but relative power.
To see the dependence of power upon the decision rule, let us consider
a different decision rule: a probabilistic rule with the probability of
outcome a equal to the proportion of votes for a. In this case, each individual’s
constitutional control of each event is 0.5, independent of his direction
of interest. Under condition (i) of no vote exchange, his power to realize
his interests will be, as above, proportional to the product of his control
over events and his interest in the event. Comparing this with the earlier
measure of power under the unanimity decision rule shows that the measure
is quite different under this new decision rule. If the same decision rule
existed, but individual 1 had three votes and 2 had only one, then 1’s ability
to realize his interest on issues 1, . . . , n would increase to $\sum_{i=1}^{n} .75\, x_{i1}$,
and his power would increase to

$$r_1 = \frac{\sum_{i=1}^{n} .75\, x_{i1}}{\sum_{i=1}^{n} .75\, x_{i1} + \sum_{i=1}^{n} .25\, x_{i2}}$$

that is, since each individual’s rescaled interests sum to 1.0,

$$r_1 = 0.75$$
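
The dependence of power on the vote split under the probabilistic rule can be checked with a short sketch. It follows the text's expression, in which control over every event equals an individual's share of the votes. The interest values are hypothetical; only the three-to-one split comes from the text.

```python
# Sketch of power under the probabilistic rule with no vote exchange.
# Control of every event equals an individual's share of the votes, so the
# expected realized interest is that share times his total (rescaled) interest.
# The interest values are hypothetical; only the vote split 3:1 is from the text.

from fractions import Fraction as F

x1 = [F(1, 2), F(3, 10), F(1, 5)]   # interests of individual 1, summing to 1
x2 = [F(1, 10), F(3, 10), F(3, 5)]  # interests of individual 2, summing to 1

def power_of_1(votes_1, votes_2):
    """Relative power of individual 1 when control is proportional to votes."""
    c1 = F(votes_1, votes_1 + votes_2)
    c2 = F(votes_2, votes_1 + votes_2)
    realized_1 = sum(c1 * x for x in x1)
    realized_2 = sum(c2 * x for x in x2)
    return realized_1 / (realized_1 + realized_2)

equal_votes = power_of_1(1, 1)
three_to_one = power_of_1(3, 1)
print(equal_votes, three_to_one)
```

Because each individual's interests sum to 1, the result depends only on the vote split and not on how the interests are distributed across events.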

It is evident that the total realized interests in the system can be summed
in such a system as this, for they must be summed over persons to provide
the denominator for the measurement of power. This means also that it is
quite possible to obtain, with this sum, the total “social welfare,” if we are
quite explicit about what we mean by social welfare. This social welfare
that results from these collective decisions is social welfare in which different
individuals’ interests may count the same or differently, not as a function
of an omniscient God or his representative, a social scientist, deciding
that each individual’s interests should count a certain amount in the social
welfare, but as a result of the decision rule, the procedural rules, and the
distribution of interests. Thus if the constitutionally established decision
rule requires unanimity, it gives greater power to those who favor the
status quo, and the aggregate social welfare can only be stated relative to
this constitutional inequality. The actual distribution of interests realized
gives the distribution of welfare. As a consequence, two statements need to
be made in stating the social welfare arising in a given system of collective
decisions: (a) the sum of realized interests; and (b) the decision rule and
procedural rules, which together with the distribution of interests determine
the relative power of each individual to realize his interests. An alternative
to the second statement is the actual distribution of power, or some measure
of the skewness of the distribution.12
Thus the interests of some may count more than the interests of others in
the social welfare as actually realized. Why is this so; why not rather that
the interests of all count equally? The answer lies wholly in the constitution
which dictates the decision rule that distributes control over decisions and
gives the permissible political procedures.13 One can then object, however,
that we have not progressed at all but instead have merely pushed the question
up another level, with the question becoming Why does the constitution
provide this degree of inequality (or equality)? The objection is a valid
one, but betrays a mistaken notion that there can be something outside society
in terms of which its power distribution can ultimately be justified.
Because a society is a self-governing system, the only explanation is an infinite
regress, moving at every point closer to fundamental sources of power,
such as economic productivity, control of strategic resources, intelligence,
and physical strength.
Yet to have pushed the question up one level may provide a very valuable
service. It shows what the indirect consequences of a given constitutional
procedure are, and thus allows those who have power over the constitution
(which almost always is more nearly the total of the members of the
electorate than is true for specific decisions) to compare the consequences
with constitutional intent. Thus it allows for a more rational framing of the
constitution, to allocate power among members of society more precisely in
accord with the intent of those with power over the constitution.

POWER UNDER CONDITIONS OF FREE EXCHANGE

The definition of power above is only appropriate for the case in which
each individual is restricted to merely casting his vote. But under conditions
of free exchange (such as [ii] or [iii] above), each individual has additional
actions he can carry out. These actions increase his ability to get what he
wants, for they allow him to use the interests others have in an event over
which he has total or partial control. That is, if individual 1 has control over
events 1, . . . , j because he favors the status quo, then his ability to realize
his own interests in other events depends upon individual 2’s interest in
these events 1, . . . , j. If 2 has little interest in these events, then he is uninterested
in an exchange which will allow him to gain control of them.
However, if 2 has much interest in these events, he will give up control over
events of interest to 1 in order to gain control of them. Thus we might be
tempted to say that under conditions of free exchange, an individual’s
ability to realize his interest (that is, his power) is equal to his degree of
control over events times the interest that all individuals have in those events.
If the interest is his own, he can realize it directly through exercising his
control; if it is another’s, he can realize an equivalent interest by exchanging
his control over this event with the interested party for one in which he
does have high interest. However, such a formulation neglects one point,
which can best be illustrated by example. Suppose individual 1 has control
over event k, which individual 2 has high interest in, but 2 has no control
over an event of interest to 1. In such a case, the interest that 2 has in event
k is useless to 1 in exchange, because of the lack of power of 2. The case
becomes even clearer if there are three persons in the system. If 1 controls
event k of interest to 2, and event j, of interest to 3, then which is more valuable
to him? It depends on the relative power of 2 and 3. If 3 controls
every event not controlled by 1, then 2’s interest in event k is worthless to 1.
Then taking these two points together, in a system of free exchange
(assuming a large number of events so that terms of exchange could be
appropriately worked out by combinations of events), an individual’s power
or ability to realize his interest is given by his constitutional control over
an event times the interest of individuals in this event, each individual’s
interest modified by his power; this whole quantity summed over all events:14

$$r_j = \sum_{i=1}^{n} \sum_{k=1}^{s} c_{ij}\, x_{ik}\, r_k$$

where $c_{ij}$ is the proportion of total constitutional control held by individual
j over event i and $x_{ik}$ is interest as defined earlier.
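
Since each individual's power appears on both sides of this expression, one way to obtain numerical values is repeated substitution. The sketch below is only an illustration under stated assumptions and not a procedure given in the text: the two-event, two-individual matrices are hypothetical, control over each event is assumed to sum to 1 across individuals, each individual's interests are assumed rescaled to sum to 1, and convergence of the iteration is assumed for this example rather than proved in general.

```python
# Sketch: solving r_j = sum_i sum_k c_ij * x_ik * r_k by repeated substitution.
# The 2-event, 2-individual matrices are hypothetical. c[i][j] is the share of
# control over event i held by individual j (each row sums to 1); x[i][k] is the
# interest of individual k in event i (each column sums to 1).

c = [[1.0, 0.0],   # event 0 is controlled entirely by individual 0
     [0.0, 1.0]]   # event 1 is controlled entirely by individual 1
x = [[0.2, 0.6],
     [0.8, 0.4]]

def solve_power(c, x, iterations=200):
    """Iterate the defining equation, renormalizing total power to 1.0."""
    n_events, n_people = len(c), len(c[0])
    r = [1.0 / n_people] * n_people
    for _ in range(iterations):
        new_r = [
            sum(c[i][j] * x[i][k] * r[k]
                for i in range(n_events) for k in range(n_people))
            for j in range(n_people)
        ]
        total = sum(new_r)
        r = [value / total for value in new_r]
    return r

r = solve_power(c, x)
print(r)
```

In this illustration individual 1 ends with the larger share of power because he controls the event in which both individuals have the larger combined interest.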
This means that in a system of free exchange with no constraints on
political exchange, an individual’s ability to realize his interests is proportional
to his control over interests (modified by power), no matter how
far removed from him those who have interests in the event he controls are.
If there is restriction upon exchange, then his ability to realize his interests
lies only in his control over events of interest to him. Obviously, then, with
exchange there is an expansion of the individuals’ ability to realize their own
interests, because each can further his own interest through exchanges of
control. The actual amount of interest that each can reach depends, of
course, upon the distribution of interests for and against new policy. If
these interests are rather evenly divided, then obviously there will be little
excess gain over loss, while highly skewed interests will result in a great
excess of interests realized.
Because each individual can use the interests of others in an event over
which he has control, control of a given event tends to have a certain value
in exchange. This value is simply the sum of the interests of all individuals
in the event, with the interest of each modified by his power in the system.
That is, the value of event i, $v_i$, is given by:

$$v_i = \sum_{j=1}^{s} x_{ij}\, r_j$$
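
The value of an event can be computed directly once a power vector is in hand. In the sketch below the interest matrix, the control matrix, and the power vector are all hypothetical, and the power vector is taken as given rather than derived. The last step checks the relation, stated later in the text, that an individual's power equals the total value of the events under his control.

```python
# Sketch of the value of an event: v_i = sum_j x_ij * r_j.
# The interest matrix, control matrix, and power vector are hypothetical;
# the power vector is taken as given here rather than derived.

from fractions import Fraction as F

x = [[F(1, 5), F(3, 5)],   # x[i][j]: interest of individual j in event i
     [F(4, 5), F(2, 5)]]
c = [[F(1), F(0)],         # c[i][j]: share of control over event i held by j
     [F(0), F(1)]]
r = [F(3, 7), F(4, 7)]     # assumed power of individuals 0 and 1

def event_values(x, r):
    """Value of each event: power-weighted sum of all individuals' interests."""
    return [sum(x_ij * r_j for x_ij, r_j in zip(row, r)) for row in x]

v = event_values(x, r)

# Power recovered as the total value of events under each individual's control.
power_from_values = [
    sum(c[i][j] * v[i] for i in range(len(v))) for j in range(len(r))
]
print(v, power_from_values)
```

The recovered powers coincide with the assumed ones only because the assumed vector was chosen to be consistent with these particular matrices.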

Beyond these definitions, it is possible to say something about the maximization
of social welfare, given three initial conditions: (a) the decision
rule in use; (b) the matrix of interests of individuals in events; and (c) the
matrix of constitutional control of individuals over events. Given these
initial conditions, social welfare (defined as relative to the distribution of
constitutional control, as indicated earlier) is maximized under conditions
of absolutely free and frictionless political exchange with full information.15
In seeing why this is so, it is first useful to consider why social welfare is
not automatically maximized under a majority voting rule. The reason
is that the minority might in fact feel more strongly about an event
and, even given equality of constitutional control, lose to a majority which
felt only mildly about the event. This is the classical problem of majority
rule.16
The conditions of absolutely free political exchange mean that for each
individual the interest of another in an event that he controls is as valuable
to him in exchange as is his own interest (each interest of course, modified
by the individual’s power). Thus consider a system of free exchange, in
which tentative exchanges are made. (In exchange of private goods, Walras
has described such a system of tentative exchanges in discussing the development
of a price in markets.) The exchanges will be of the form in
which an individual or group increases its control over one or more events,
while another individual or group increases its control over another event
or events. Now first, we observe that each party to the exchange increased
by the exchange its total expected utility from the system; if it did not, it
would not have entered into the exchange. Thus the only possible losses
incurred through the exchange are due to externalities—losses experienced
by those who were not parties to the exchange. Only if their expected losses
(weighted by their power) are greater than the expected gains of the parties
to the exchange and others who stand to benefit from the exchange will there
be a reduction in social welfare due to the exchange. But if this is so, it
means that they have the resources to carry out an exchange with one or
both of the parties, overriding the initial tentative exchange. For each event
i involved in the exchange, each party j will benefit or lose by the exchange
a quantity $\Delta p_i y_{ij}$, where $\Delta p_i$ is the change in probability toward
passage of the new policy due to the tentative exchange on event i, and
$y_{ij}$ is the utility difference in favor of the new policy. The sum total of welfare
gained and lost by the exchange is thus obtained by adding $\Delta p_i y_{ij}$ over
all events i for the individual, then weighting this sum by the individual’s
power $r_j$, and finally summing separately all the positive numbers and the negative
ones. If the negative sum is greater in magnitude, more has been lost by the exchange
than gained by it. But since $r_j$ is the total value of events under the control
of j ($r_j = \sum_{k} c_{kj} v_k$), and the absolute value of $y_{ij}$ is $x_{ij}$, then the measure of
social welfare lost is also a measure of the sum of the resources which the
losing parties j can afford to give up to redress the loss. The measure of
resources that a given losing party can afford to give up is that fraction of
his interest that he stands to lose, $\sum_{i} \Delta p_i y_{ij}$, times the total value under his
control, $\sum_{k=1}^{n} c_{kj} v_k$. Since this value is exchangeable on the market for an
equivalent value, then if the gain achieved by those who gained in the
tentative exchange was less than $\sum_{j} \sum_{i} \Delta p_i x_{ij} r_j$ (summed only over those who
lose by the exchange), as hypothesized, the losing parties can arrange an
exchange of greater value. Admittedly such an exchange might be difficult
to arrange, but in a system with a sufficient number of events, a sufficient
number of individuals, frictionless exchange, and sufficient time, a stable
set of exchanges could be concluded that would lead each event to be
resolved in the direction that maximizes social welfare. Thus with perfect
exchange, for each of the set of events, the collective decisions will be
decided in that way which maximizes social welfare under the given
constitution.
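
The bookkeeping in the preceding argument can be sketched numerically. What follows is only a minimal illustration under stated assumptions, not a procedure given in the text: the function name `welfare_change` and all sample numbers are hypothetical, `delta_p[i]` stands for the change in probability on event i, `y[i][j]` for the utility difference of party j on event i, and `r[j]` for the power of party j.

```python
def welfare_change(delta_p, y, r):
    """Return (gains, losses), the power-weighted welfare gained and lost.

    delta_p[i]: change in probability of passage on event i (assumed given)
    y[i][j]:    utility difference for party j on event i, in favor of the new policy
    r[j]:       power of party j
    """
    gains, losses = 0.0, 0.0
    for j, power in enumerate(r):
        # Expected utility change for party j, added over all events.
        change = sum(dp * y[i][j] for i, dp in enumerate(delta_p))
        weighted = power * change
        if weighted >= 0:
            gains += weighted
        else:
            losses += -weighted
    return gains, losses


# Hypothetical numbers: two events, three parties.
delta_p = [0.2, -0.1]
y = [[0.5, 0.4, -0.6],
     [-0.1, 0.2, 0.3]]
r = [0.3, 0.3, 0.4]

gains, losses = welfare_change(delta_p, y, r)
# When losses exceed gains, the argument holds that the losing parties
# command enough resources to arrange an overriding exchange.
losers_can_override = losses > gains
```

In this made-up case the third party alone loses, and its weighted loss outweighs the combined weighted gains of the other two, which is the situation in which the text expects the tentative exchange to be overridden.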

NOTES
1. Sometimes politicians forget this and assume that an action is clearly
beneficial to all, later to discover to their dismay that this is not so at all. An excellent
case in point is the behavior of many city councils in passage of a measure
fluoridating the water system. The cost is almost nothing (it does not raise the tax rate),
and the benefits would seem to be widespread, with no harm to anyone—clearly
a Pareto-optimal move. But in many cases, popular reaction showed that it was
far from Pareto optimal, and in many communities a referendum overturned a
previous decision of the city council to fluoridate. This has happened with such
frequency in some states that a state law has been passed requiring a popular
referendum before fluoridating the water system.
2. Arrow’s condition of nondictatorship can be seen as closely related to the
criterion of Pareto optimality. The Pareto criterion says that not even all other
members of the collectivity together can dictate to one; Arrow’s condition says that
no one member of the collectivity can dictate to the others.
3. In this connection it is interesting to note that a socialist sociologist from
Czechoslovakia at the International Congress of Sociology in 1966 discussed with
approval a current program in his country for de-equalization of incomes, arguing
that incomes of different occupations were too equal for the good of the economy.
4. This means, in effect, that there was nothing in economic theory that predicted
the utility of a combination of goods from the utilities of the component goods. If
there had been, cardinal utility would have been implied by the theory. One can
ordinarily recognize whether a theory implies ordinal measures by the existence of
a comparison operation in the individual behavior that allows prediction of further
comparisons under the assumption of transitivity. One can recognize cardinal
measurement if the value assigned to an object resulting from the combination of a
number of objects can be predicted by adding or averaging the values assigned to
the individual objects that make up the combined object.
5. See especially Little, op. cit. (1957), pp. 52-66. One passage especially is of
interest. “No one could ‘deny’ interpersonal comparisons in the sense that they deny
that people make them. Therefore, those economists who ‘deny’ them must think
that when a person says ‘A is happier than B’ he is deluding himself that he is
making a statement of fact. . . . Therefore those who ‘deny’ interpersonal
comparisons must deny the existence of other minds,” pp. 54-55.
6. If there are, as is often the case, three alternatives (one favored by A, one
favored by B, and the absence of common action at all), then a simple extension
of rationality would lead to a solution. Suppose A prefers the action favored by B
to the absence of common action. If so, he will be willing to accept the action favored
by B, if the alternative is no common action. If at the same time, B prefers the
absence of common action to the action favored by A, he will not accept that
action. Thus the action favored by B would win. This is an instance of the “principle
of least interest,” which states that the person least interested in the maintenance of a
relationship can have most power in it.
7. There is, in the succeeding treatment, an asymmetry of a and b, introduced
in order to allow a decision rule that allows only Pareto-optimal moves. However,
a different decision rule would remove the asymmetry without affecting the present
derivation. Consequently, the same derivation would hold if a and b were not
distinguished in any way.
8. The partial order takes the form of a cumulative scale, constructed like
a Guttman scale in attitude scaling, where either set of x’s is taken as the items
and the other as the individuals. If a symmetric decision rule such as a 50-50
chance of a and b had been chosen for the case of disagreement, a complete order
could be established for the $x_{i1}$’s and $x_{i2}$’s for events 1, . . . , m, by asking both
individuals whether or not they would be willing to give up issue i for j, for all
pairs (i, j). Alternatively, if we carry out a test for transitivity with a few pairs,
the overall task would be greatly simplified by having both persons order the issues
in terms of their interest.
9. The reason that a definite statement cannot ordinarily be made about the
social welfare change due to increased freedom of trade or increased freedom for
political negotiation is that externalities are involved, that is, effects on individuals
who do not participate in the trade. If there were three persons making this
collective decision, then an agreement between two of them might well harm the
third. If exchanges are allowed between political and economic resources, there
is even more room for externalities. In politics, for example, freedom for legislators
to accept money for votes would have negative effects for those who are not party
to the exchange, and more generally for those who do not have money to exchange
for votes.
10. There could be a number of other methods used to assign numbers to the x’s,
depending on part of the decision rules used. The use of decision rules which gave a
decision favoring individual 1 with varying probabilities could be used to assign
numbers in conjunction with a measure of probability. More generally, it is possible
to use any one of several techniques for measuring utility under risk (e.g., Von
Neumann-Morgenstern (1947) or Ramsey-Suppes (1955)). This procedure, however,
might not give an additive measure even if the suggested procedure did, for it
would not only involve combinations of issues, but also risk, meaning that he would
make exchanges as if his interest in this issue were $\Delta p_i x_{ij}$, where $\Delta p_i$ is the
increment in subjective probability of a favorable outcome, given the exchange.
11. If a different procedure had been assumed, starting condition (iii) after
initial preferences were established, rather than after single-event exchanges, then
a higher amount of welfare might have resulted for each person, since some of the
single-event exchanges made under (ii) may have precluded combinations that
would have resulted in a higher welfare for each.
12. The standard deviation of the distribution appears to be an especially good
measure of this inequality, for it is equivalent to a function of the differences in power
between all pairs of individuals. It is the square root of the average squared difference
of power between all pairs of individuals in the system.
13. Buchanan and Tullock (1962) have emphasized the importance of separating
problems of collective decisions into two levels, one a constitutional decision which
establishes the procedures under which the operating decisions are made.
14. This system of equations can be solved for $r_j$ by transposing $r_j$ to the
right-hand side of the equation, and dividing through all terms of each equation by $r_s$
(the power of the last-labeled person), to give $r_1^*, \ldots, r_{s-1}^*$. Then the first s−1
equations are independent equations in s−1 unknowns ($r_1^*, \ldots, r_{s-1}^*$), and can be solved
by standard means. The r*’s can be transformed into r’s by recognizing that the total
power is defined to be 1.0, and by use of the equation:

$$\left(\sum_{i=1}^{s-1} r_i^* + 1\right) r_s = 1.0$$

Since all r*’s are known, this equation can be solved for $r_s$, and then this used to
solve for $r_i$, since $r_i = r_i^* r_s$.

15. This assumes, of course, that the only resources used in such exchange are
control over events in this system. If resources from outside the system are used,
then the larger system, including those resources, must be the system for analysis.
16. Henceforth, the term “social welfare” will be used to mean social welfare
relative to the distribution of constitutional control.
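
The relation stated in note 12 between the standard deviation and the pairwise differences of power can be checked numerically. The sketch below is an illustration under assumed conventions, not part of the original notes: it uses the population standard deviation and averages over all ordered pairs of individuals (including each individual paired with himself), and the power values are hypothetical. Under these conventions the root-mean-square pairwise difference equals the standard deviation times the square root of 2, so the two measures differ only by a constant factor.

```python
import math
from itertools import product


def std_dev(powers):
    """Population standard deviation of the power distribution."""
    n = len(powers)
    mean = sum(powers) / n
    return math.sqrt(sum((p - mean) ** 2 for p in powers) / n)


def rms_pairwise_difference(powers):
    """Square root of the average squared difference over all ordered pairs."""
    n = len(powers)
    total = sum((a - b) ** 2 for a, b in product(powers, repeat=2))
    return math.sqrt(total / (n * n))


# Hypothetical distribution of power among three individuals (sums to 1.0).
powers = [0.5, 0.3, 0.2]
sigma = std_dev(powers)
rms = rms_pairwise_difference(powers)
# Under the stated conventions, rms is sqrt(2) times sigma.
ratio = rms / sigma
```

Other conventions (unordered pairs, or the sample standard deviation) change the constant but not the proportionality.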

REFERENCES
Arrow, Kenneth J., Social Choice and Individual Values. New York: John
Wiley and Sons, Inc., 1951.
Buchanan, James M., “The Relevance of Pareto Optimality,” The Journal
of Conflict Resolution, v. 6, 1962, pp. 341-354.
Buchanan, James M., and Tullock, Gordon, The Calculus of Consent. Ann
Arbor, Michigan: University of Michigan Press, 1962.
Harrod, R. F., “Scope and Method of Economics,” The Economic Journal,
v. 48, 1938, pp. 383-412.
Hicks, John R., “The Foundation of Welfare Economics,” Economic
Journal, v. 49, 1939, pp. 696-712.
Kaldor, N., “Welfare Propositions of Economics and Interpersonal
Comparisons of Utility,” Economic Journal, v. 49, 1939, pp. 549-552.
Little, I. M. D., A Critique of Welfare Economics, second edition. Oxford:
Oxford University Press, 1957.
Little, I. M. D., “Social Choice and Individual Values,” The Journal of
Political Economy, v. 60, 1952, pp. 422-432.
Musgrave, Richard, The Theory of Public Finance. New York: McGraw-Hill, 1959.
Ramsey, Frank, The Foundations of Mathematics and other Logical Essays.
New York: The Humanities Press, 1950.
Robbins, Lionel, “Inter-personal Comparisons of Utility,” The Economic
Journal, v. 48, 1938, pp. 635-641.
Samuelson, P. A., “Evaluation of Real National Income,” Oxford Economic
Papers, N.S. v. 2, 1950, pp. 1-29.
Scitovsky, T., “A Note on Welfare Propositions in Economics,” Review of
Economic Studies, 1941, pp. 77-88.
Suppes, Patrick, and Winet, Muriel, “An Axiomatization of Utility Based
on the Notion of Utility Differences,” Management Science, 1955, v. 1,
pp. 259-270.
Von Neumann, John, and Morgenstern, Oskar, The Theory of Games and
Economic Behavior, Princeton: Princeton University Press, 1947.
Wicksell, Knut, “A New Principle of Just Taxation,” translated and reprinted
in Musgrave, Richard A. and Alan T. Peacock, Classics in the
Theory of Public Finance. London: Macmillan, 1958, pp. 72-118.
COERCION
Robert Nozick

This study1 of coercion is intended as a preliminary to a longer study of
liberty, whose major concerns will be the reasons which justify making
someone unfree to perform an action, and the reason why making someone
unfree to perform an action needs justifying. Though coercion is intimately
connected with liberty (some writers capsulize freedom as absence of
coercion), it does not exhaust the range of nonliberty or unfreedom. In
particular, being coerced into not doing an act is neither a necessary nor
a sufficient condition for being unfree to do it. That it is not necessary is
shown by the following examples:
(a) A person robs a bank and is caught and punished. If he knew for
sure he would be caught and punished for robbing the bank, he
would not do so, but he does not know this and so robs the bank.
He was unfree to rob the bank, though he was not coerced into
not doing so.
(b) I was not coerced into not murdering a member of the audience at
Columbia when I read this paper, though I was unfree to do so.
(c) If I lure you into an escape-proof room in New York and leave you
imprisoned there, I do not coerce you into not going to Chicago
though I make you unfree to do so.
That being coerced into not doing act A is not a sufficient condition for
being unfree to do A is shown by the following example: You threaten to
get me fired from my job if I do A, and I refrain from doing A because
of this threat and am coerced into not doing A. However, unbeknownst to
me, you are bluffing; you know you have absolutely no way to carry out
this threat, and would not carry it out if you could. I was not unfree to
do A (no doubt I thought I was), though I was coerced into not doing A.
But though coercion does not exhaust the notion of unfreedom, it is
obviously closely connected to it.2
This paper attempts to clarify the concept of coercion, and some related
concepts. Though this is an interesting and intriguing task, I do not pursue
it for its own sake. I am primarily interested in the uses to which such a
clarification can be put; the questions one will be in a better position to
answer given this clarification (other than questions like: what are the
necessary and sufficient conditions for P’s coercing Q into not doing A, or
for P’s threatening Q). I shall not be able here to get to these further
questions, and shall be engaged largely in tool-sharpening rather than in
tool use.
One final preliminary remark, or warning, or apology. This study of
coercion is an exploratory study, and is meant to raise questions and
suggest problems. To many of these questions and problems I propose
tentative answers and solutions, but some are left open. I would be happier if
I could answer or solve them all. But in philosophy questions and problems
often outlive specific proposed answers and solutions. Unfortunately, so
it was in the course of writing this paper.

CONDITIONS FOR COERCION

I shall begin by considering an account of coercion obtained from
combining some things said on this subject in Hart and Honoré’s Causation in
the Law with some remarks of Hart in his The Concept of Law.3 According
to this account, person P coerces person Q into not doing act A if and
only if
(1) P threatens to do something if Q does A (and P knows he’s making
this threat).
(2) This threat renders Q’s doing A substantially less eligible as a course
of conduct than not doing A.
(3) P makes this threat in order to get Q not to do A, intending that Q
realize he’s been threatened by P.
(4) Q does not do A.
(5) P’s words or deeds are part of Q’s reason for not doing A.4
Conditions 1-5 do not appear to be sufficient for coercion. For
example, P threatens Q, saying that if Q performs a particular action, a
rock will fall and kill him. P thinks Q knows of his (P’s) infamous
procedure of murdering people, but Q thinks that P is telling him about some
strange natural law that holds independently of human action, namely
whenever someone performs this action, he gets killed by a falling rock.
That is, Q understands what P says, not as a threat but as a warning. If Q
refrains from performing the action, P has not coerced him into not doing
it, even though the five conditions are satisfied. This suggests that we add
as a further condition:
(6) Q knows that P has threatened to do the something mentioned in 1,
if he, Q, does A.5
It is not clear that the conditions thus far listed are sufficient. You
threaten to do something if I do A, thinking that I don’t want this
something done. But in fact, I don’t mind it or even slightly want it. However,
I realize that you must feel very strongly about my doing A, since you’ve
threatened me, and that you will be very upset if I do A (not that you
will choose to be upset to punish me for doing A). Since I don’t want
you to be upset, I refrain from doing A. You did not coerce me into not
doing A, though it seems that the conditions listed are satisfied; in
particular it seems that the relevant conditions 5 and 2 are satisfied. Or, if
they can be interpreted so that they’re not satisfied, it would be well to
make this interpretation explicit, replacing 5, 1 and 2 by:

(5') Part of Q’s reason for not doing A is to avoid (or lessen the likelihood
of) the consequence6 which P has threatened to bring about or have
brought about.7

(1') P threatens to bring about or have brought about some consequence if
Q does A (and knows he’s threatening to do this).

(2') A with this threatened consequence is rendered substantially less eligible
as a course of conduct for Q than A was without the threatened
consequence.

Must P make the threat in order to get Q not to do A? Must condition 3
be satisfied? In normal situations it will be satisfied; e.g., a highwayman
says “If you don’t give me your money, I’ll kill you,” making this threat
in order to get me to give him my money. But suppose that we are
conducting an experiment for the Social Science Research Council, to study
people’s reactions in the highwayman situation. We don’t care how he reacts
to our threat (if he gives over the money we must turn it over to the SSRC;
if he resists we are empowered to kill him and, let us suppose, have no
moral scruples about doing so). We do not say “your money or your life”
in order to get him to give us his money, but in order to gather data. We
might even suppose that I think him very brave and have bet with you that
he’ll resist and be killed. After making the bet, I want him not to hand over
the money, and I don’t make the threat in order to get him to hand it
over. In the grip of fear and trembling, he hands over the money. Surely we
coerced him into doing so. This suggests replacing 3 by a more complicated
condition:

(Part of) P’s reason for deciding to bring about the consequence or have
it brought about, if Q does A, is that P believes this consequence worsens
Q’s alternative of doing A (i.e. that P believes that this consequence
worsens Q’s alternative of doing A, or that Q would believe it does).8

The SSRC example satisfies this condition, since (part of) the researchers’
reason for deciding to kill Q if he doesn’t turn over the money is that they
believe this consequence worsens Q’s alternative of not giving them the
money.
But the condition formulated is not broad enough, for we want to cover
cases where P has not decided to bring about the consequence if Q does A,
but is bluffing instead, or neither intends nor intends not to bring it about if
Q does A. This suggests disjoining another condition with the one above:

If P has not decided to bring about the consequence, or have it brought
about, if Q does A, then (part of) P’s reason for saying he will bring
about the consequence, or have it brought about, if Q does A is that
(P believes) Q will believe this consequence worsens Q’s alternative of
doing A.9

One is tempted to say that this disjunctive condition is superfluous,
because it is built into the notion of threatening, and hence follows from
condition 1'. That is, one is tempted to say that if this condition is not
satisfied, if P’s reasons or motives are not as described, then P has not
threatened Q. I shall have more to say about this later.
The conditions listed still do not appear to be sufficient. Consider cases
where Q wants to do A in order to bring about x, and P says that if Q
does A, he (P) will do something which just prevents A from bringing
about x. This makes A substantially less eligible as an alternative for Q
(Q now, we may suppose, has no reason to do A), and the other
conditions may well be satisfied. Yet, at least some cases of this sort (“If you
say another word, I shall turn off my hearing aid”) are not cases of
coercion.10
Cases of this sort suggest the following condition:

(7) Q believes that, and P believes that Q believes that, P’s threatened
consequence11 would leave Q worse off, having done A, than if Q didn’t
do A and P didn’t bring about the consequence.12

In the application of this condition, in deciding how well or poorly off Q
is having done A and having had his purpose x thwarted, one must ignore
Q’s wasted effort, humiliation at having failed to bring about x, and (in
some cases) Q’s foregone opportunities. Similarly in deciding how well or
poorly off Q would be not doing A and not having P bring about the
consequence, one must ignore any regret Q might feel at not doing A.13
According to our account of “P coerces Q into doing A,” the following
cases are not cases in which P coerces Q into doing A.
(1) Q mishears P as having said “Your money or your life” and hands
over his money, but P said something else, or said this as a
question about something he thought Q said, etc.
(2) P doesn’t speak English, but has picked up the sentence “Your
money or your life” from a movie, though he does not know what
it means. To be friendly, P utters this one sentence to Q who is
sitting next to him in a bar (perhaps while showing Q his unusual
knife for Q to admire). Q hands over his money.
(3) Q walks into a room, and unbeknownst to him there is a tape
recorder in the next room playing part of the soundtrack of a movie.
Q hears “Put all of your money on the table and then leave, or
I’ll kill you.” Q puts his money on the table, and leaves.
I suggest that in these cases, though Q feels coerced and thinks he is
coerced, P does not coerce Q into giving over the money. (In the third
case there is no plausible person P to consider.) Those who refuse to
accept this might hold the view that though P does not coerce Q into giving
over the money, nonetheless Q is coerced into giving over the money.
Such a person would reject the view that “Q is coerced into doing A” is
equivalent to “there is a P who coerces Q into doing A,” and perhaps
suggest that Q is coerced into doing A if and only if

(1) There is a P who coerces Q into doing A

or (2) Q is justified in believing that there is a P who has threatened to bring
about a consequence which significantly worsens his alternative of not
doing A (and that P has the appropriate reasons and intentions), and
(part of) Q’s reason for doing A is to avoid or lessen the likelihood of
this consequence he believes was threatened.

I should mention that a threat need not be verbally expressed; it may
be perfectly clear from actions performed what the threat is, or at least that
something undesirable will occur if one doesn’t perform some appropriate
action. For example, members of a street gang capture a member of a
rival gang and ask him where that gang’s weapons are hidden. He refuses
to tell, and they beat him up. They ask again, he refuses again, they beat
him again. And so on until he tells. He was coerced into telling. His
captors didn’t have to say, “if you don’t tell us we will continue to beat you
up or perhaps eventually do something worse.” This is perfectly clear to
all involved in the situation. In many situations the infliction of violence is
well understood by all parties to be a threat of further infliction of violence
if there is noncompliance. Nothing need be said.14 It may be for reasons
such as this that some writers (e.g. Bay) say that all infliction of violence
constitutes coercion. But this is, I think, a mistake. If a drunken group
comes upon a stranger and beats him up or even kills him, this need not
be coercion. For there need have been no implicit threat of further violence
if the person didn’t comply with their wishes, and it would indeed be difficult
for this to be the case if they just come upon him and kill him.15
There is another type of situation very similar to the one we have thus
far been concerned with, for which similar conditions can be offered. I
have in mind cases where no one threatens to inflict some damage on Q
if he does A, but someone sets things up so that damage is automatically
inflicted if Q does A. It’s not that if you do A, I will bring about a
consequence which you consider to be bad, but rather that I now do something
(the doing of which is not conditional upon your doing A) which is such
that if you do A after I have done this thing, there will be a consequence
which you consider to be bad.16 Though in such situations a person is
deterred from doing something, it is not obvious to me that he is coerced
into not doing it. If it is coercion then the account of coercion would say
that P coerces Q into not doing A if and only if either of the two sets of
conditions is satisfied.17
I suggest that it is as a case of this sort of situation, rather than the one
discussed earlier, that we are to understand the following: some adult’s
mother says to him, “If you do A I’ll have a heart attack, or the
probability = p that I’ll have a heart attack.” I have in mind a case where
the mother does not choose to have a heart attack if her son does A or
to do something which will bring on or raise the probability of a heart
attack. She just knows she will (or that the probability = p). It seems to me
that the mother’s statement is not plausibly construed as a threat to
(choose to) do something or bring about a consequence if her son does A.
To use a distinction which will be discussed later, what the mother issues
is not a threat but rather a nonthreatening warning. If we look just at the
first sort of situation, we will conclude that the mother did not coerce the
son into not doing A. But this example can plausibly be viewed as a case
of the second sort of situation, in which before Q does A, P does something,
making this known to Q, which worsens Q’s alternative of doing A. And
if this counts as coercion, then the mother may coerce the son. We should
look, in this case, at the mother’s act, prior to her son’s doing A, of telling
him that she will or probably will have a heart attack if he does A. We
may suppose that without her announcement the consequence of her son’s
doing A is some probability of her having a heart attack and some
probability of his feeling guilty (a function of the probability of his realizing
why she died and the probability that he will feel guilty anyway because
he did something she didn’t like and then she died) and some probability
of A’s having quite nice consequences. And we may suppose that after
the mother’s announcement the consequences of his doing A are changed
significantly. For now there is some probability of her dying and his feeling
enormously guilty (because he ignored her warning), and even if she won’t
die because of his doing A, if he does A he will worry over this possibility,
feel guilty about doing something he knows upsets her, etc. Her act of
making her announcement before he did A worsened the consequence
of his doing A. If we suppose, furthermore, that one of her reasons for
making the announcement was to worsen the consequences, and that one
of his reasons for not doing A was this worsening of consequences, then
we have a situation of the second sort. And if this sort of situation counts
as coercion, the son was coerced into not doing A.18

NONCENTRAL CASES OF COERCION

I have thus far concentrated upon the central part or core of the notion
of coercion, and, in order to avoid too many complications all at once,
have spoken of necessary and sufficient conditions for coercion (period).
However, I believe that there are further cases of coercion which do not
themselves satisfy the conditions thus far discussed, but which are cases
of coercion by virtue of standing in certain specifiable relations to central
cases of coercion.19 It is a task of some intricacy to get these relations
just right. The statements which follow are meant to indicate areas in
which principles must be formulated. It is not claimed that these
statements are the formulations which one would eventually arrive at, nor is it
claimed that the statements which follow exhaust the areas for which
principles must be formulated. Let me repeat: the statements below are meant
to indicate areas for which principles must be formulated and are not put
forward as the correct formulation of principles in these areas. I would
expect that, after such principles are adequately formulated, a recursive
definition of “P coerces Q into doing A” would be offered, which would
begin with the conditions for the central cases discussed earlier.

(1) If P coerces Q into doing A, and “A” contains as a proper part the
referring expression “r1” and “B” is obtainable by substituting the referring
expression “r2” for “r1” in “A”, and “r2” and “r1” have the same
reference, and “r1” occurs transparently in “Q does A,” then P coerces
Q into doing B.20

(2) If P coerces Q into doing A, and it is a necessary truth that if anyone
does A he does B, and it is not a necessary truth that if anyone does
anything he does B, then P coerces Q into doing B.

(3) If P coerces Q into doing A, and it is a nomological truth that if anyone
does A he does B, and it is not a nomological truth that if anyone does
anything he does B, then P coerces Q into doing B.

(4) If P coerces Q into doing A, and if the only way anyone can do A
is by doing either B1 or B2 or, . . . , or Bn, then P coerces Q into doing
B1 or B2 or, . . . , or Bn.21

(5) If P coerces Q into doing A, and the only way in which Q can do A
is by doing either B1 or B2 or, . . . , or Bn, then P coerces Q into doing
B1 or B2 or, . . . , or Bn.22

(6) If Q can do A only by doing B1 or B2 or, . . . , or Bn, and Q sets out to
do A (intending to do it) because of P’s threat of a harmful consequence
if Q doesn’t do A, and Q does one of the Bi in order to do A, then Q
is coerced into doing B1 or B2 or, . . . , or Bn, even if Q does not do A
(whether because he’s prevented from doing A or because he’s changed
his mind about doing it).

(7) If P coerces Q into doing B1 or B2 or, . . . , or Bn, and B1 is the best
of the Bi’s, the only one of the Bi’s it would be reasonable to do, etc.,
and Q does B1 for this reason, then P coerces Q into doing B1.

(8) If P coerces Q into doing A, and x is a consequence of Q’s A, and
______, then P coerces Q into bringing about x.
(What further conditions are needed in the blank?)23

In order to avoid concluding that Q was coerced into doing B when Q
does A (partly) because of the threat, and does B (which stands in one
of the stated relations to A) for some other reason, we must add to the
antecedent of each of these statements the qualification that (part of)
the reason Q does B (or B1, or B2, or . . . , or Bn) is to avoid or lessen the
likelihood of P’s threatened consequence (if Q does A).
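
The shape of the recursive definition anticipated above can be sketched as a closure computation. This is only an illustrative sketch under stated assumptions, and the text is explicit that the statements are not final formulations: it covers only statements of the form (2) and (3) together with the qualification just added, and the function name `coerced_acts`, the act labels, and the sample relations are all hypothetical.

```python
def coerced_acts(central, entails, universal, done_to_avoid_threat):
    """Close the central cases of coercion under rules shaped like (2) and (3).

    central:              acts Q is coerced into by the central conditions
    entails:              maps act A to acts B such that anyone who does A does B
    universal:            acts anyone does whenever he does anything (excluded)
    done_to_avoid_threat: acts Q does (partly) to avoid P's threatened consequence
    """
    coerced = set(central)
    frontier = list(central)
    while frontier:
        a = frontier.pop()
        for b in entails.get(a, ()):
            if b in coerced or b in universal:
                continue  # already derived, or ruled out as trivial
            if b not in done_to_avoid_threat:
                continue  # Q does b for some other reason
            coerced.add(b)
            frontier.append(b)
    return coerced


# Hypothetical act labels, for illustration only.
central = {"A"}
entails = {"A": {"B", "T"}, "B": {"C", "D"}}
universal = {"T"}                      # something anyone does in doing anything
done_to_avoid_threat = {"A", "B", "C"}  # D is done for an unrelated reason

result = coerced_acts(central, entails, universal, done_to_avoid_threat)
```

Here the closure contains A, B, and C: T is excluded by the non-triviality clause, and D by the qualification about Q's reasons.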

THREATS AND OFFERS

The notion of a threat has played a central role in what has been said
thus far. In this section we shall consider the differences between threats
and offers, and in the next section we shall consider the differences between
threats and warnings.
If P offers Q substantially more money than Q is earning at his current
job to come to work for P, and Q accepts because he wants to increase
his income, has P coerced Q into working for him? Some writers (Hale,
Bay) would say that P has; the threat being “come to work for me or I
won’t give you the money.”24 On this view, every employer coerces his
employees, every employee his employer (“give me the money or I won’t
work for you”), every seller of an object coerces his customers (“give me
the money or I won’t give you the object”), and every customer the
person from whom he buys. It seems clear that normally these aren’t
cases of coercion. Offers of inducements, incentives, rewards, bribes,
consideration, remuneration, recompense, payment do not normally constitute
threats, and the person who accepts them is not normally coerced.
As a first formulation, let us say that whether someone makes a threat
against Q’s doing an action or an offer to Q to do the action depends on
how the consequence he says he will bring about changes the consequences
of Q’s action from what they would have been in the normal or natural
or expected course of events. If it makes the consequences of Q’s action
worse than they would have been in the normal and expected course of
events, it is a threat; if it makes the consequences better, it is an offer.25
The term “expected” is meant to shift between or straddle predicted and
morally required.26 This handles pretty well the clear cases of threats and
offers. Let us see how it fares with more difficult examples.
(a) P is Q’s usual supplier of drugs, and today when he comes to Q he says
that he will not sell them to Q, as he normally does, for $20, but rather
will give them to Q if and only if Q beats up a certain person.

(b) P is a stranger who has been observing Q, and knows that Q is a drug
addict. Both know that Q’s usual supplier of drugs was arrested this
morning and that P had nothing to do with his arrest. P approaches Q
and says that he will give Q drugs if and only if Q beats up a certain
person.

In the first case, where P is Q’s usual supplier of drugs, P is threatening
not to give Q the drugs. The normal course of events is one in which P
supplies Q with drugs for money. P is threatening to withhold the supply,
to deprive Q of his drugs, if Q does not beat up the person. In the second
case, where P is a stranger to Q, P is not threatening not to supply Q
with drugs; in the normal course of events P does not do so, nor is P
expected to do so. If P does not give Q the drugs he is not withholding drugs
from Q nor is he depriving Q of drugs. P is offering Q drugs as an inducement
to beat up the person. Thus in the second case, P does not coerce
Q into beating up the person, since P does not threaten Q. (But the fact
that P did not coerce Q into beating up the person does not mean that it
would not be true for Q to say, in some legitimate sense of the phrase:
“I had no choice.”)
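
The first formulation, applied to cases (a) and (b), can be put as a minimal sketch in Python. It is an illustration only and no part of the argument: it assumes that the consequences of an action of Q’s can be scored by a single number standing for how good they are for Q, and the name `classify_proposal` and all the numeric values are hypothetical.

```python
def classify_proposal(baseline_value, proposed_value):
    """Compare, for one action of Q's, the value to Q of its consequences
    under P's proposal with their value in the normal or expected course
    of events (the baseline)."""
    if proposed_value < baseline_value:
        return "threat"
    if proposed_value > baseline_value:
        return "offer"
    return "neither"


# Case (a), the usual supplier. The action scored is "Q does not beat up the
# person". In the baseline Q can still buy drugs for $20 (value 1, assumed);
# under the proposal Q gets no drugs (value 0, assumed).
supplier = classify_proposal(baseline_value=1, proposed_value=0)

# Case (b), the stranger. In the baseline P gives Q nothing either way.
# Refusing leaves Q where he was; complying now brings drugs.
stranger_refuse = classify_proposal(baseline_value=0, proposed_value=0)
stranger_comply = classify_proposal(baseline_value=0, proposed_value=1)

print(supplier, stranger_refuse, stranger_comply)  # prints: threat neither offer
```

The sketch makes plain that the same words from P receive different classifications only because the baseline differs between the two cases.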
There is a further point to be considered about the first case in which
P is Q’s usual supplier of drugs. In addition to threatening to withhold the
drugs if Q doesn’t beat up a certain person, hasn’t P made Q an offer?
Since in the normal and expected course of events Q does not get drugs
for beating up the person, isn’t this a case in which P then offers Q drugs
as an incentive to beat up the person? And if P has made this offer, why
do we view the overall situation as one in which P threatens Q, rather than
as one in which P makes Q an offer? We have here a situation in which P
takes a consequence viewed as desirable by Q (receiving drugs) off one
action (paying $20) and puts it onto another action (beating up the person).
Since Q prefers and P believes that Q prefers paying the money and
receiving the drugs to beating up the person and receiving the drugs, and
since Q would rather not beat up the person, P’s statement is a threat to
withhold the drugs if Q doesn’t beat up the person, and this threat predominates
over any subsidiary offer P makes for Q to beat up the person,
making the whole situation a threat situation.
But instead of subtracting a desirable consequence from one of Q’s
actions and tagging the same consequence onto another of Q’s actions,
P may subtract a desirable consequence C from one of Q’s actions A1 and
add a more desirable consequence C' onto another action A2 available to
Q. For example, the dope peddler might say to Q, “I will not give you drugs
if you just pay me money, but I will give you a better grade of drugs, without
monetary payment, if you beat up this person.” It seems plausible to
think that as one increases the desirability of C' to Q, at some point the
situation changes from one predominantly involving a threat to deprive
Q of C if he does A1 (doesn’t do A2) to one which predominantly involves
an offer to Q of C' if he does A2. And it seems plausible to claim that this
turning point from threat to offer, as one increases the value of C' for Q,
comes at the point where Q begins preferring A2 and C' to A1 and C (stops
preferring the latter to the former?).27
The following principle embodies this claim, and also covers the case
where it is the same consequence which is switched from one action to
another as in the earlier example. It also is meant to apply to obvious
mixtures of threats and offers, e.g., “If you go to the movies I’ll give you
$10,000. If you don’t go, I’ll kill you.”

If P intentionally changes the consequences of two actions A1 and A2 available
to Q so as to lessen the desirability of the consequences of A1, and so
as to increase the desirability of the consequences of A2, and part of P’s
reason for acting as he does is to so lessen and increase the desirabilities of
the respective consequences then

(a) This resultant change predominantly involves a threat to Q if he does
A1 if Q prefers doing the old A1 (without the worsened consequences)
to doing the new A2 (with the improved consequences).

(b) This resultant change predominantly involves an offer to Q to do A2
if Q prefers doing the new A2 (with the improved consequences) to the
old A1 (without the worsened consequences).
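
Clauses (a) and (b) of this principle can be put as a short sketch in Python, offered only as an illustration. It assumes that Q’s preferences can be represented by numbers; the function name `predominant_character` and the values are hypothetical. The principle as stated does not say what happens when Q is exactly indifferent, so the sketch reports that case as undetermined rather than settling it.

```python
def predominant_character(value_old_a1, value_new_a2):
    """Apply clauses (a) and (b) of the principle.

    value_old_a1: value to Q of doing the old A1 (without the worsened consequences).
    value_new_a2: value to Q of doing the new A2 (with the improved consequences).
    """
    if value_old_a1 > value_new_a2:
        return "threat"  # clause (a)
    if value_new_a2 > value_old_a1:
        return "offer"   # clause (b)
    # The principle as stated is silent about exact indifference.
    return "undetermined"


# The same consequence is switched from one action to another: Q would rather
# pay $20 for the drugs (old A1) than beat someone up for them (new A2).
same_drugs = predominant_character(value_old_a1=5, value_new_a2=2)

# A better grade of drugs C' is attached to A2; its value to Q is raised in steps.
turning_point = [predominant_character(5, v) for v in (2, 4, 5, 6, 9)]

print(same_drugs)
print(turning_point)
```

The list shows the turning point from threat to offer arriving where Q begins to prefer the new A2 to the old A1.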

This principle ties in nicely with something we shall say later. For when
the change predominantly involves a threat, Q would normally not be
willing to have this change made (since he’d rather do the old A1 than
either of the two alternatives after the change), whereas when the change
predominantly involves an offer, Q would normally be willing to have the
change made (since he’d rather do the new A2 than the old A1, and if he
prefers doing one of the old alternatives (A1) to doing the new A2, he can
still do it). I shall claim later that this willingness or unwillingness to make
the change marks an important difference between offers and threats.28
If a statement’s being a threat or an offer depends upon how the carrying
out of the statement affects the normal or expected course of events,
one would expect that there will be situations where it is unclear whether a
person is making a threat or an offer because it is unclear what the normal
and expected course of events is. And one would expect that people will
disagree about whether something is a threat or an offer because they disagree
about what the normal and expected course of events is, which is
to be used as a baseline in assessing whether something is a threat or an
offer. This is indeed the case.
Consider the following example. Q is in the water far from shore,
nearing the end of his energy, and P comes close by in his boat. Both
know there is no other hope of Q’s rescue around, and P knows that Q is
the soul of honesty and that if Q makes a promise he will keep it. P says
to Q “I will take you in my boat and bring you to shore if and only if you
first promise to pay me $10,000 within three days of reaching shore with
my aid.” Is P offering to take Q to shore if he makes the promise, or is he
threatening to let Q drown if Q doesn’t make the promise? If one views
the normal or expected course of events as one in which Q drowns without
P’s intervention, then in saying that he will save Q if and only if Q makes
the promise, P is offering to save Q. If one views the normal or expected
course of events as one in which a person in a boat who comes by a
drowning person, in a situation such as this, saves him, then in saying that
he will save Q if and only if Q makes the promise, P is threatening not to
save Q. Whether P’s saying that he will save Q if and only if Q makes
the promise is an offer to save Q or a threat not to save Q depends upon
what the normal or expected course of events is.
Since it is likely to be clear to the reader which course of events he
wants to pick out as normal and expected as the background against which
to assess whether P’s statement is an offer or a threat (namely, the one that
makes it a threat) we should sharpen the example. Suppose in addition to
the foregoing that P knows that Q has greatly wronged P (or others), but
that Q cannot be legally punished for this (no law covered the wrong, a
legal technicality, the statute of limitations has run out, or some such
thing). Or P knows that Q will go on to do monstrous deeds if rescued.
In some such situations it will be unclear what P is morally expected to do,
and hence unclear whether his statement is a threat or an offer. For other
such situations it will be clear that P is morally expected to let Q drown,
and hence his statement will be an offer.29
Thus far we have considered threats as introducing certain deviations
from the normal and expected course of events. The question arises as to
whether the normal or expected course of events itself can be coercive.
Suppose that usually a slave owner beats his slave each morning, for no
reason connected with the slave’s behavior. Today he says to his slave,
“Tomorrow I will not beat you if and only if you now do A.” One is
tempted to view this as a threat, and one is also tempted to view this as
an offer. I attribute these conflicting temptations to the divergence between
the normal course of events, in which the slave is beaten each morning,
and the (morally) expected course of events, in which he is not. And I
suggest that we have here a situation of a threat, and that here the morally
expected course of events takes precedence over the normal course of
events in assessing whether we have a threat or an offer.30
One might think that in deciding whether something is a threat or an
offer, the (morally) expected course of events always takes precedence
over the normal or usual course of events, where these diverge. It is not
obvious that this is so. I have in mind particularly the example mentioned
earlier, where your normal supplier of dope says that he will continue to
supply you if and only if you beat up a certain person. Here, let us suppose,
the morally expected course of events is that he doesn’t supply you
with drugs, but the course of events which forms the background for
deciding whether he has threatened you or made you an offer is the normal
though not morally expected course of events (in which he supplies you
with drugs for money); it is against this background that we can obtain the
consequence that he’s threatened you.
Thus, in both the slave and the addict examples the normal and morally
expected courses of events diverge. Why do we pick one of these in one
case, and the other in the other as the background against which to assess
whether we have a threat or an offer? The relevant difference between these
cases seems to be that the slave himself would prefer the morally expected
to the normal course of events whereas the addict prefers the normal to the
morally expected course of events.31 It may be that when the normal and
morally expected courses of events diverge, the one of these which is to
be used in deciding whether a conditional announcement of an action constitutes
a threat or an offer is the course of events that the recipient of the
action prefers.32
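
The tentative suggestion just made, that the recipient’s own preference selects the baseline when the normal and morally expected courses of events diverge, can be sketched in Python. The sketch is illustrative only: the function names, the labels for the courses of events, and the numeric values are all hypothetical, and where the recipient is indifferent the sketch returns `None`, since the text does not settle that case.

```python
def choose_baseline(normal, morally_expected, value_to_recipient):
    """Pick the course of events against which a statement is assessed."""
    if normal == morally_expected:
        return normal
    v_normal = value_to_recipient[normal]
    v_moral = value_to_recipient[morally_expected]
    if v_moral > v_normal:
        return morally_expected
    if v_normal > v_moral:
        return normal
    return None  # indifference: not settled by the suggestion


def is_threat(baseline, outcome_if_refuses, value_to_recipient):
    """A statement is a threat if refusing leaves the recipient worse off
    than he would be in the baseline course of events."""
    return value_to_recipient[outcome_if_refuses] < value_to_recipient[baseline]


# The slave: normally beaten each morning, morally expected not to be.
slave_values = {"beaten": 0, "not beaten": 1}
slave_baseline = choose_baseline("beaten", "not beaten", slave_values)
slave_threatened = is_threat(slave_baseline, "beaten", slave_values)

# The addict: normally supplied for money, morally expected not to be supplied.
addict_values = {"supplied for money": 1, "not supplied": 0}
addict_baseline = choose_baseline("supplied for money", "not supplied", addict_values)
addict_threatened = is_threat(addict_baseline, "not supplied", addict_values)

print(slave_baseline, slave_threatened)
print(addict_baseline, addict_threatened)
```

On these assumed values the slave’s baseline is the morally expected course and the addict’s is the normal course, and in both examples the statement comes out as a threat, which is the result the text reaches.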
I have raised the question of whether the normal and expected course
of events itself can be coercive, and was led to consider cases where the
normal and (morally) expected courses of events diverged. I now would
like to consider this question again, for cases where they do not diverge.
Can P, by saying that he will bring about a consequence if Q does A
(where this consequence is such that if Q does A, P would bring it about
in the normal and (morally) expected course of events), coerce Q into
not doing A? Suppose that in the normal and morally expected course of
events, people get punished for theft. Aren’t some people coerced into
not stealing by the legal apparatus?
One might say that if a type of action or consequence is itself part of
the normal and expected course of events if Q does A, one should use the
normal and expected course of events minus this type of action or consequence
as a background against which to assess whether a statement is a
threat. If the consequences of an action would be worse, if the statement
is carried out, than they would be in this new course of events (i.e. the
normal and expected course of events without the type of action or consequence)
then the statement is a threat. But who knows what the world
would be like if there was no punishment for crimes? It might well be that
things would be so bad that the institution of punishing crimes would improve
the consequences of almost all actions, and hence count, according
to this suggestion, as making offers to people.
An alternative procedure seems more reasonable; namely, to consider
the normal and expected course of events, if Q does A, without P’s particular
act or without the particular consequence P will bring about, and
against this background assess whether P’s statement that if Q does A he
will do a particular act or bring about a particular consequence constitutes
a threat (i.e. whether P’s statement, if carried out, makes Q’s A worse
than it would be in this new course of events).
There remain some problems about knowing what the course of events
would be without this act, but these seem manageable. On this view, even
though in the normal and expected course of events Q gets punished for
theft, the statement that he will be punished for theft counts as a threat
since the act of punishment, if Q steals, unfavorably affects the consequences
of one act of Q’s (stealing) against the background of the normal
and expected course of events minus this act of punishment.33
According to the account offered earlier of (the first sort of) coercion,
threats are necessary for coercion. One might extend the account to include
some offers, if there were clear situations in which Q is coerced into
doing A even though Q does A because P offered to do B if Q did A.
Despite my inclination to say that one is never coerced when one does
something because of an offer (unless in the case discussed earlier, the
slaveowner is making an offer to his slave), there is one sort of case, where
the offer is closely tied to coercion or attempted coercion, which I find it
difficult to decide about. Suppose that P knows that Q has committed a
murder which the police are investigating, and knows of evidence sufficient
to convict Q of this murder. P says to Q, “If you give me $10,000 I will
not turn over the information I have to the police.” Let us assume that
were P unable to contact Q and present his proposal he would turn the
information over to the police. Furthermore, in this situation P is (morally)
expected to turn the information over to the police. So in the normal and
expected course of events, P turns the information over to the police
(whether or not Q gives him $10,000). It would seem, therefore, that P is
offering not to turn the information over to the police, rather than threatening
to turn it over. Yet one is strongly tempted to say, when Q pays P
$10,000 because he accepts the offer, that Q was coerced by P into paying
the $10,000.34
If the following principle were correct, then this would be a case of
coercion:

If P offers to refrain from aiding the threatener of a coercive consequence35
for Q’s A from bringing about this consequence, in exchange for
Q’s doing B, and if the credible threat of this consequence36 if Q didn’t
perform B would coerce Q into doing B, then when Q does B because
of the offer, Q was coerced into doing B.

A case similar to the previous one is one in which the police arrest Q
for a crime, believing that he has committed it and having sufficient evidence
to convict him of it. In the course of questioning Q, they come to believe
that Q knows who has committed some other crime, and they say that Q
will not be prosecuted if and only if he tells them who has committed this
other crime. Since if the police did not think Q knew who committed the
other crime they would have prosecuted, and since they are morally expected
to have him prosecuted, the police have offered not to have Q
prosecuted rather than threatened to have him prosecuted. If Q names the
person who committed the other crime, in order to escape being prosecuted,
some are strongly tempted to say that he was coerced into giving the information.
The above principle would yield this consequence. Though I
do not deny that one may say, in some legitimate sense of these expressions,
“Q was forced to do what he did,” “Q had no choice,” I am unable to
decide whether, in the above cases, Q was coerced into doing so, and I
leave this an open question.
The two previous cases are cases where P is morally expected or required
to do the act, and would normally do so (turn the murderer over
to the police, have Q prosecuted). It is worth mentioning cases where P
has a legal and moral right to do the act, but would not decide to do so
(even if Q didn’t do A) were he not trying to get Q to do A. For example,
P has a right to build on his land blocking Q’s view, or foreclose Q’s
mortgage, or bring legal action against Q (on a valid and enforcible claim)
but would not decide to do so (it’s not worth the trouble, P has no pressing
need for funds, etc.) were it not for his wanting Q to do A. P tells Q that
unless he does A, he (P) will build on his land, foreclose Q’s mortgage,
bring legal action against Q, etc. Since P’s action is not part of the morally
expected course of events (not that P is morally expected not to do it) and
since in the normal course of events P wouldn’t do it, the account yields
the result that one would wish: In these cases P is threatening to perform
his actions rather than offering not to do so.

THREATS AND WARNINGS

In the section on Conditions for Coercion, because of the example of
the SSRC people saying “your money or your life,” we rejected the condition
that P says or does what he does in order to get Q not to perform
some particular action A. We substituted instead a condition requiring
that (part of) P’s reason for deciding to bring about the consequence if
Q does A (or, if he hasn’t decided, saying that he has) is that (P believes
that) this worsens Q’s alternative of doing A (or that Q would believe it
does). And we mentioned the view that this is part of the notion of making
a threat. This view illuminates the fact that some statements about one’s
future actions if Q does A, are not threats even though the acts you’ve
stated you would do if Q does A worsen Q’s alternative of doing A.
Such statements about one’s future actions I shall call nonthreatening
warnings (for short, just “warnings”). The distinction between threats and
nonthreatening warnings is crucial to some questions that arise in the law.37
For example, an election is about to be held in a factory to determine
whether the employees will be represented by a labor union. The owner
of the factory announces to his employees that if the union wins the election,
he will close his factory and go out of business. Has he threatened the employees
with loss of their jobs if the union wins, or merely warned them
what will happen if the union wins? If a majority of the employees would
have voted for the union if not for the announcement, and the union lost
because of the announcement, were the employees coerced by their employer
into rejecting the union?
We may view this situation as a game, represented by the following
matrix:

                                          Employees
                                I. union wins    II. union loses

Employer  A. Stays in business        (b)              (a)
          B. Goes out of business     (c)              (d)

The employees make the first move (pick a column) and the employer
makes the second (picks a row) knowing what move the employees have
first made. That is, first the employees choose to be represented by the
union or not, and then the employer, knowing what choice his employees
have made, decides to stay in business or go out of business. I shall assume
that each of the members of some particular majority of the employees
preferentially ranks the outcomes as follows:

(b)
(a)
(c)
(d)

and shall call this the preference ordering of the employees.38 There seem
to me to be at least four cases worth considering,39 corresponding to four
different preference rankings of these alternatives by the employer. These
are:

(1) (a)      (2) (a)            (3) (a)      (4) (c) - (a)
    (b)          (c) - (b)          (c)          (b)
    (c)          (d)                (b)
    (d)

I assume, for each case, that the employer knows the preference ranking
of the employees.
Case 1: The employer, if he were sure that the union would win the
election, would not make his announcement, since he prefers continuing in
business though the factory is unionized, to going out of business. That is,
he prefers (b) to (c). However he commits himself to going out of business
if the factory is unionized, and announces this decision (that is, rules out
(b)), in the hope that this will lead his employees to reject the union. By
ruling out (b) beforehand he leaves his employees (a), (c), and (d)
among which to choose, and since of these (a) is highest in the preference
ranking of the employees, they will presumably then act so as to realize
(a), and this is the alternative which the employer ranks most highly. The
employer is committing himself beforehand, for strategic reasons, to do
something (B) in a situation (I) such that had he not committed himself
to this he would be better off doing something else (A) in that situation
when and if it arose. It seems clear that in this case, when the employer
announces that he will go out of business if the union wins the election
he is threatening his employees.40 For in the normal course of events, he
does not go out of business, given his preferences, if the employees vote
for the union. Hence he’s announcing that he will depart from the normal
course of events in a way to the detriment of his employees, if they elect the
union.41 And also (part of) his reason for deciding to go out of business if
the union wins is that he believes that this worsens his employees’ alternative
of electing the union (and he thereby hopes to influence them to
reject the union). Hence his announcement is a threat, and he has threatened
to go out of business if the union wins the election.
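
The strategic structure of case 1 can be checked with a small backward-induction sketch in Python. It is an illustration under stated assumptions, not part of the argument: the outcome labels follow the text (the employer rules out (b) by committing to close if the union wins), the rankings are those given for the employees and for the case 1 employer, and the names `OUTCOME`, `employer_reply` and `play` are hypothetical.

```python
# Outcome labels: (employer's move, employees' move) -> label used in the text.
OUTCOME = {
    ("stay", "union_wins"): "b",
    ("stay", "union_loses"): "a",
    ("close", "union_wins"): "c",
    ("close", "union_loses"): "d",
}

EMPLOYEES_RANKING = ["b", "a", "c", "d"]        # best first
EMPLOYER_RANKING_CASE_1 = ["a", "b", "c", "d"]  # best first


def employer_reply(column, ranking, commitment=None):
    """The employer moves second. A commitment maps a column to a forced row;
    otherwise he picks the row whose outcome he ranks highest."""
    if commitment and column in commitment:
        return commitment[column]
    return min(("stay", "close"), key=lambda row: ranking.index(OUTCOME[(row, column)]))


def play(employees_ranking, employer_ranking, commitment=None):
    """The employees move first, anticipating the employer's reply."""
    def result(column):
        row = employer_reply(column, employer_ranking, commitment)
        return OUTCOME[(row, column)]

    column = min(("union_wins", "union_loses"),
                 key=lambda col: employees_ranking.index(result(col)))
    return column, result(column)


no_announcement = play(EMPLOYEES_RANKING, EMPLOYER_RANKING_CASE_1)
with_announcement = play(EMPLOYEES_RANKING, EMPLOYER_RANKING_CASE_1,
                         commitment={"union_wins": "close"})

print(no_announcement)    # prints: ('union_wins', 'b')
print(with_announcement)  # prints: ('union_loses', 'a')
```

Without the commitment the employees elect the union and the employer stays in business, which is (b); with it they reject the union and the result is (a), the employer’s most preferred outcome.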
Case 2: The employer announces that he will go out of business if the
union wins the election, thus ruling out (b), and leaving the employees a
choice between (a) and (c). He makes this announcement hoping that
because of it, his employees will reject the union. Note that in this case,
unlike the first one, there is not something the employer would rather do
than go out of business if the union wins the election. (For he is indifferent
between (c) and (b), whereas in the first case, the employer preferred
(b) to (c).) Though there aren’t exactly the same strategic considerations as
in the first case, still strategic considerations are involved. And this employer,
too, has threatened his employees. For in the normal or expected course of
events he wouldn’t have made his announcement, and would, being indifferent
between (c) and (b), decide whether or not to stay in business
if and when the union won the election. We may suppose that before he
decides there is some nonzero probability of his deciding to close the
factory, and some nonzero probability of his staying in business. This
being what one would expect in the normal course of events, his announcement
and commitment to going out of business, for sure, if the union wins
the election, changes the normal or expected course of events in the typical
manner of a threat. For his being almost certain to go out of business
is worse, from his employees’ viewpoint, than there being some probability
of his going out of business and some probability of his remaining in
business. Since part of his reason for deciding to go out of business if the
union wins is that (he believes) this worsens his employees’ alternative of
electing the union, this constitutes a threat.
But if in this case the employer does not announce that he will close if
the union wins (which would be a threat), but instead announces truthfully
that he is indifferent between closing and staying in business if the
union wins and will decide what to do afterwards (and if his employees
voting for the union in the face of this announcement would not anger him
and yield a larger probability of his closing than if the announcement
hadn’t been made), then the case may be significantly different. This issue
is also raised by case 3, and is treated in the discussion of it.
Case 4: The employer tells his employees what his preferences are, and
that he will go out of business if the union wins. We may suppose that he
doesn’t make his decision or tell them this in order to get them to reject
the union, since he doesn’t care whether they reject it or not. For even if
they elect it, he can go out of business, and there is nothing he prefers to
that. No strategic considerations are involved. He makes his announcement
solely to inform his employees of what will be the consequences of
their action. It was no part of his reason for deciding to go out of business
if the union won, that this worsened his employees’ alternative of electing
the union, and no part of his reason for making the announcement was to
get his employees to reject the union. It seems clear that this is not a case of
coercion, and that the employer has not made a threat (even though his
employees might sorrowfully tell the union representative that they had no
choice) but has rather issued a nonthreatening warning.
Case 3: I have left case 3 for last because it is the most difficult. In this
case, the employer announces what his preference ranking is, and says that
he’ll go out of business if the union wins the election. And indeed, unlike
the employers in cases 1 and 2, but like the employer in case 4, he prefers
going out of business if the union wins the election to staying in business
with the union representing his employees. In the normal course of events,
he would go out of business if the union wins, whether or not he has
previously announced that he would do so. However, unlike the employer
in case 4 but like the employers in cases 1 and 2, he prefers staying in
business without the union to going out of business if the union wins the
election, and makes his announcement in order to get his employees to
reject the union. Has he threatened his employees, or just warned them?
I am inclined to say that he has warned them rather than threatened them.
(Note that a teacher can warn a student that he will fail unless his work
improves, even if the teacher does so in order to get the student to work
harder.) For he does not decide to close if the union wins in order to
worsen the alternative of the union winning, and in making the announcement
he does not worsen this alternative but rather makes known what its
consequences will be. Furthermore, there seems to be no presumption
against this employer’s telling his employees that he will close if the union
wins, whereas there is (normally) a presumption against making threats.42
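
The verdicts reached in the four cases can be collected in a short Python sketch. It is an illustration only, and it compresses the discussion into one assumed rule: the announcement is counted a warning when the employer strictly prefers (c) to (b), so that closing is what he would do anyway if the union won, and a threat otherwise. The verdict in case 3 is stated in the text only as an inclination. The rankings are encoded as tiers, with outcomes in one tier being indifferent; (d) is left out of cases 3 and 4 because no rank for it is shown for them. The names `tier`, `classify_announcement` and `has_strategic_motive` are hypothetical.

```python
# Employer rankings as ordered tiers, best first; outcomes sharing a tier
# are ones between which the employer is indifferent.
EMPLOYER_RANKINGS = {
    1: [{"a"}, {"b"}, {"c"}, {"d"}],
    2: [{"a"}, {"b", "c"}, {"d"}],
    3: [{"a"}, {"c"}, {"b"}],
    4: [{"a", "c"}, {"b"}],
}


def tier(ranking, outcome):
    """Index of the tier containing the outcome (0 is best)."""
    for index, outcomes in enumerate(ranking):
        if outcome in outcomes:
            return index
    raise ValueError(f"outcome {outcome!r} is not ranked")


def classify_announcement(ranking):
    """Assumed rule: a warning if he strictly prefers closing (c) to staying
    with the union (b); otherwise the commitment to close is a threat."""
    if tier(ranking, "c") < tier(ranking, "b"):
        return "warning"
    return "threat"


def has_strategic_motive(ranking):
    """True if he strictly prefers (a) to (c), and so has a reason to want
    the announcement to lead his employees to reject the union."""
    return tier(ranking, "a") < tier(ranking, "c")


verdicts = {case: classify_announcement(r) for case, r in EMPLOYER_RANKINGS.items()}
motives = {case: has_strategic_motive(r) for case, r in EMPLOYER_RANKINGS.items()}

print(verdicts)
print(motives)
```

Case 3 is the one where the two columns come apart: the employer has a strategic motive for announcing, yet the announcement is classed as a warning because it only makes known what he would do in any event.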
These cases raise an interesting point relevant to the wider task of
deciding what actions people should be free to do, and what actions they
should be unfree to do. It may be that, even if one picks out a particular
pattern of freedom and unfreedom as optimal, there is no acceptable institutional
arrangement available to one which realizes this pattern. One
may lack the institutional means to realize exactly the optimal pattern of
freedom and unfreedom, or certain ways of realizing this pattern, and
publicly distinguishing among persons, may be unacceptable to the society
at large. So it may be that, given the available institutional means, the
feasible patterns of freedom and unfreedom among which one must choose
are suboptimal and nondominated patterns of the following forms:
(a) Where some persons are unfree to do acts they should (according to
the optimal pattern) be free to do.

(b) Where some people are free to do acts they should be unfree to do.

(c) Where some people are free to do acts they should be unfree to do, and
some people are unfree to do acts they should be free to do.

A full theory of freedom would, as well as specifying the optimal pattern of
freedom and unfreedom, concern itself with choices among such suboptimal
patterns.
It may be that we have an example of such a choice here. For one may
wish to make the case 1 and 2 employers unfree to make their announcements
(threats) while leaving the case 3 and 4 employers free to make
their announcements (warnings). However it may be difficult to devise
an institutional arrangement which accomplishes this, for it may be difficult
to distinguish case 1 and 2 employers from case 3 and 4 employers. (Note
that if one attempts to distinguish them, there will be reason for case 1
and 2 employers to lie when asked about their preferences.) The actual
institutional choice one faces may be to forbid all such announcements,
to allow all such announcements, or to use some condition almost coextensive
with the suitable preferences and forbid and allow announcements
on the basis of whether this condition is satisfied, regretfully admitting that
one cannot handle all the cases in the way one would wish.43
Problems of choice among suboptimal patterns arise also and obviously
for paternalistic legislation; legislation which, in order to prevent him from
coming to harm, or to lessen the chance of this, or to enable him to realize
some good, makes someone unfree to perform a particular act.44 The
feasible patterns will often require either making some persons, who do
not need the paternalistic protection and might be better off without it,
unfree to perform some acts, or leaving some people free to do acts they
should be unfree to do (for their own protection). (I spare the reader
examples.) One will often have to choose among such patterns because
there is no realizable and acceptable institutional arrangement which divides
people up just right (according to the optimal pattern) with respect to
particular acts. An important point emerges from this discussion; namely,
that the statement that a particular piece of legislation makes some persons
unfree to perform some acts they should not be unfree to perform (according
to the optimal pattern of freedom and unfreedom), even if true, is not
by itself a conclusive objection to the legislation. For it may be that no
feasible and acceptable pattern of freedom and unfreedom is more optimal
(or all other realizable patterns are less optimal) than the one yielded by
this legislation.
I might mention the statement, which we might call a tip, which stands
to an offer as a nonthreatening warning stands to a threat; that is, P’s
statement which points out that P will bring about some consequence if
Q does A which improves Q’s alternative of doing A, though P’s believing
that it improves or that Q would believe it improves Q’s doing A is not
part of P’s reason for deciding to bring about the consequence if Q does A.
Building into the notion of P’s making Q an offer to do A the requirement
that (part of) P’s reason for bringing about the consequence if Q
does A is to improve Q’s alternative of doing A, illuminates the following
example: P comes up to Q and says, “your money or your life.” Q resists
and beats P up. Had P not said what he did and had he not confronted
Q with a gun, and had Q just beaten P up, Q would have been a bully and
people would have scorned him. But now the consequences of beating
P up are made far more attractive to Q; Q becomes a hero if he does so.
But P hasn’t made Q an offer to beat him up, because though what P did
improved the consequences of some action of Q’s, bringing about this
improvement was not part of P’s reason for acting as he did.45 Has P tipped
Q off to something? Highwaymen normally don’t go on to say, “And if you
resist and beat me up you’ll be a hero,” but if one did, he would have
given his prospective victim a tip, however little appreciated. Note,
incidentally, that one should not think for the case where Q beats P up after P
makes his threat that Q was coerced into doing so, even though Q did so to
avoid the threatened consequence. For not all of the other necessary
conditions for coercion are satisfied.

THREATS, OFFERS, AND CHOICES

I have claimed that normally a person is not coerced into performing
an action if he performs it because someone has offered him something to
do it, though normally he is coerced into performing an action if he does
so because of a threat that has been made against his not doing so. Writers
who count offers as coercive do so, I suspect, because they accept
something roughly like the following statement:

If Q has available to him the actions in a set A, and as a result of what
P has done or will do

(a) act A1 is significantly higher in utility to Q than the other actions in A

(b) act A2 is significantly lower in utility to Q than the other actions in A

whereas it wasn’t before, and Q

(a) does A1 because of this

(b) refrains from A2 because of this

then Q was coerced into

(a) doing A1

(b) not doing A2

According to this view, any action of P’s which results in A1’s being
significantly greater or A2’s being significantly less in utility than the other
actions in A may coerce Q. It makes no difference, according to this
view, how the difference in utility is brought about; whether in (a) A1 is
absolutely raised in utility or all the other members of A are absolutely
lowered in utility, or whether in (b) A2 is absolutely lowered in utility, or
all the other members of A are absolutely raised in utility. It is only the
resulting relative positions, however arrived at, which count. This view is
mistaken, and I shall assume that we can all think of examples that show
to our satisfaction that this is so. I now want to consider whether anything
illuminating can be said about why the notion of coercion isn’t so wide
as to encompass all bringing about of actions by the bringing about of
difference in relative position. The question I’m asking may seem bogus.
After all, there will be some terms which apply both to getting someone
to do something via threats, and to getting someone to do something via
offers, e.g., “getting someone to do something.” And there will be some
terms which apply to one and not the other. Am I just asking why the word
“coercion” is among those that apply to only one of these and not to both?
And why expect the answer, presumably going back to the word’s Latin
roots, to be philosophically interesting? So let me state my task differently.
I would like to make sense of the following claims: when a person does
something because of threats, the will of another is operating or
predominant, whereas when he does something because of offers this is not so;
a person who does something because of threats is subject to the will of
another, whereas a person who acts because of an offer is not; a person who
does something because of threats does not perform a fully voluntary
action, whereas this is normally not the case with someone who does
something because of offers; when someone does something because of offers
it is his own choice, whereas when he does something because of threats
it is not his own choice but someone else’s, or not fully his own choice, or
someone else has made his choice for him; when a person does something
because of threats he does it unwillingly, whereas this is normally not the
case when someone does something because of offers. (There are other
ways to approach this area. One might ask why we say that we accept
offers, but we go along with threats rather than accept them.)
I would like to make sense of these claims in the face of the following
three roughly true statements, which seem to indicate that threat and offer
situations are on a par so far as whose will operates, whether the act is
fully voluntary, whose choice it is, and so forth.

(1) A person can be gotten to do something which someone else wants him
to do, which he otherwise wouldn’t do, by offers as well as by threats.

(2) A person can choose to do what there is a threat against his doing,
just as a person can choose to do what there is an offer for him not to
do. (“just as”?)
(3) Sometimes a threat is so great that a person cannot reasonably be
expected not to go along with it, but also sometimes an offer is so great
that a person cannot reasonably be expected not to go along with it
(that is, not to accept it).

I shall consider only a partially described person, whom I shall call the
Rational Man, and unfortunately shall not get to us. The Rational Man,
being able to resist those temptations which he thinks he should resist, will
normally welcome credible offers,46 or at any rate not be unwilling to have
them be made. For he can always decline to accept the offer, and in this
case he is no worse off than he would have been had the offer not been
made. (Here I ignore the “costs” of making decisions, e.g. time spent in
considering an offer.) Why should he be unwilling to be the recipient of an
offer? On the other hand, the Rational Man will normally not welcome
credible threats, will normally be unwilling to be threatened, even if he is
able to resist going along with them. It is worth mentioning some cases
which are or seem to be exceptions to this. A person might not mind threats
if he was going to do the act anyway. But, since in this case he (probably)
wasn’t coerced, he needn’t concern us here. A person might welcome
threats which restrict the acts he can reasonably be expected to perform,
and therefore improve his bargaining position with a third party, e.g. an
employer negotiating with a labor union might welcome publicly known
threats against raising wages by more than n percent. But what he
welcomes is not his being coerced into not raising wages by more than n
percent (he is not coerced), but its looking to others as though he is coerced.
This needn’t concern us here.
But there are other cases which are somewhat more difficult. For
example, P tells Q that he’ll give Q $10,000 if in the next week someone,
without prompting, threatens Q. Someone does and Q welcomes the threat.
Or, P is jealous of Q’s receiving certain sorts of offers and tells Q that if
he (Q) receives another offer before P does, then P will kill Q. Q cringes
when the next offer comes. Or Q believes that having at least five threats
(offers) made to one in one week brings good (bad) luck, and so is happy
(unhappy) at the coming of the fifth threat (offer) in a week. Or for tax
purposes Q welcomes a threat to illegally take some of his money. And
so forth. I want to say that in such cases when threats are welcomed and
offers shunned, they are done so for extraneous reasons, because of the
special context. (One is tempted to say that in these contexts what would
normally be a threat (offer) isn’t really one.) I find it difficult to distinguish
these special contexts from the others, but the claim that threats are
normally unwelcome whereas offers are not, is not meant to apply to
contexts where some special if-then is believed to obtain, where it is believed to
be the case that if a threat (offer) is made, resisted (accepted), or
carried out then something good (bad) will happen to the recipient of the
threat (offer) (where this good (bad) consequence is not “internal” to the
threat (offer)), and this belief on the part of the recipient of the threat
(offer) overrides other considerations. It is along such lines that I suggest
viewing the person who welcomes a threat because it affords him the
opportunity to prove to others or test for himself his courage. Finally, let
me mention the case where a person is in an n-person prisoners’ dilemma
situation. In this case, he may most prefer everyone else’s being coerced
into performing a dominated action while he is left free to perform the
dominant action. He may also prefer everyone’s being coerced into
performing a specific dominated action (e.g. paying taxes) to no one’s being so
coerced. And since he realizes that the policy he most prefers, which treats
him specially, isn’t a feasible alternative, he may welcome the threat to
everyone including himself.47 But though he welcomes the system which
threatens everyone, he might still be coerced into performing his particular
action (e.g. paying his taxes). This too seems to me not to be a
counter-example to the claim that threats are unwelcome, but one of the special
contexts, with special if-thens tagged onto the making of the threat, to
which the claim is not meant to apply.
I have said that the Rational Man would normally be willing to have
credible offers made to him, whereas he would not normally be willing to
be the recipient of credible threats. Imagine that the Rational Man is given
a choice about whether someone else makes him an offer (threatens him).
For example, the Rational Man is asked, “Shall I threaten you (make you
an offer)?” If he answers “yes,” it is done. I am supposing that no offer
is made the Rational Man to say “yes” to this question, and no threat is
made against his not saying “yes”; i.e. that no threats or offers are involved
in this choice about whether a threat (offer) is to be made. Let us call the
situations before a threat or offer is made, the presituation. (I shall speak
of the prethreat and preoffer situations, in anticipation of what is to come.)
And let us call the situations after a threat or offer is made the threat and
the offer situations respectively.
Looking first at offers:

(a) The Rational Man is normally willing to go and would be willing to
choose to go from the preoffer to the offer situation.

(b) In the preoffer situation, the Rational Man is normally willing to do
A if placed in the offer situation.

(c) The Rational Man, in the preoffer situation, is unwilling to do A.
(We’re concerned with the case where he does A [partly] because of
the offer.)

(d) The Rational Man, when placed in the offer situation, does not
normally prefer being back in the preoffer situation.

Turning to threats:

(a) The Rational Man is normally unwilling to go and unwilling to choose
to go from the prethreat situation to the threat situation.

(b) In the prethreat situation, the Rational Man is normally willing to do
A if placed in the threat situation.

(c) The Rational Man, in the prethreat situation, is unwilling to do A,
and would not choose to do it. (We’re concerned with the case where
he does the act [partly] because of the threat.)

(d) The Rational Man, when placed in the threat situation, would normally
prefer being back in the prethreat situation, and would choose to move
back.

The two significant differences between these two lists are:

(1) The Rational Man would be willing to move and to choose to move
from the preoffer to the offer situation, whereas he would normally
not be willing to move or to choose to move from the prethreat situation
to the threat situation.

(2) The Rational Man, once in the offer situation, would not prefer being
back in the preoffer situation, whereas the Rational Man in the threat
situation would normally prefer being back in the prethreat situation.

If we concentrate solely on the choices made in the threat and offer
situations, we shall be hard put to find a difference between these situations
which seems to make a difference as to whose will is operating, whose choice
it is, whether the act is fully voluntary, done willingly or unwillingly, and
so forth. If, however, we widen our focus and look not only at the choices
made in the postsituations, but look also at the choice that would be made
about moving from the presituation to the postsituation, then things look
more promising. For now we face not just two choices but two pairs of
choices:

(1) To move from the preoffer to the offer situation, and to do A in the
offer situation

(2) To move from the prethreat to the threat situation, and to do A in the
threat situation.

And the Rational Man would (be willing to) make both choices in (1),
whereas he would not make both choices in (2). This difference in what
choices are or would be made (when other factors are appropriately the
same) seems to me to make the difference, when someone else intentionally
moves you from the presituation to the postsituation, to whose choice it
is, whose will operates, whether the act is willingly or unwillingly done,
and to whether or not the act is fully voluntary.
One would like to formulate a principle that is built upon the preceding
considerations, but I find it difficult to formulate one that I am confident
is not open to very simple counter-examples. Very hesitantly and
tentatively, I suggest the following plausible-looking principle:

If the alternatives among which Q must choose are intentionally changed by P,
and P made this change in order to get Q to do A, and before the change Q
would not have chosen (and would have been unwilling to choose) to have the
change made (and after it’s made, Q would prefer that it hadn’t been made),
and before the change was made Q wouldn’t have chosen to do A, and after
the change is made Q does A, then Q’s choice to do A is not fully his own.

Notice that I have not said that the feature I am emphasizing which is
mentioned in the principle, namely, being willing to choose to move from
one situation to another, is by itself sufficient for a choice in the latter
situation to be not fully one’s own, but instead I have said that this feature,
in conjunction with the other features listed in the antecedent of the
principle, is sufficient.
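
The conjunctive structure of the principle's antecedent can be laid out as a small sketch. This is only an illustration under stated assumptions: the class `Change`, its field names, and the function `antecedent_holds` are hypothetical labels introduced here, not part of the text, and each clause is crudely treated as a plain yes-or-no fact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical record of P's change to Q's alternatives (illustrative names only)."""
    made_intentionally_by_p: bool     # P intentionally changed Q's alternatives
    made_to_get_q_to_do_a: bool       # P made the change in order to get Q to do A
    q_would_have_chosen_it: bool      # before the change, Q would have chosen (been willing) to have it made
    q_prefers_it_unmade: bool         # after the change, Q would prefer that it hadn't been made
    q_would_have_done_a_before: bool  # before the change, Q would have chosen to do A
    q_does_a_after: bool              # after the change, Q does A


def antecedent_holds(c: Change) -> bool:
    """True when every clause of the antecedent is satisfied.

    True: the principle says Q's choice to do A is not fully his own.
    False: the principle is silent (it states only a sufficient condition).
    """
    return (
        c.made_intentionally_by_p
        and c.made_to_get_q_to_do_a
        and not c.q_would_have_chosen_it
        and c.q_prefers_it_unmade
        and not c.q_would_have_done_a_before
        and c.q_does_a_after
    )


# A typical threat case and a typical offer case, as assumed inputs.
threat_case = Change(True, True, False, True, False, True)
offer_case = Change(True, True, True, False, False, True)
```

On these assumed inputs the antecedent holds for the threat case and fails for the offer case. A `False` result should not be read as "fully Q's own choice", since the principle gives no necessary condition.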
Since this principle presents a sufficient condition for Q’s choice not being
fully his own, it does not yield the consequence that in the offer situation,
normally Q’s choice is fully his own. A detailed discussion of when choices
are fully one’s own, or fully voluntary, yielding this consequence, would
take us far afield. Here I just wish to suggest that the crucial difference
between acting because of an offer and acting because of a threat vis-a-vis
whose choice it is, etc., is that in one case (the offer case) the Rational
Man is normally willing to move or be moved from the presituation to the
situation itself, whereas in the other case (the threat case) he is not. Put
baldly and too simply, the Rational Man would normally (be willing to)
choose to make the choice among the alternatives facing him in the offer
situation, whereas normally he would not (be willing to) choose to make
the choice among the alternatives facing him in the threat situation.
The principle seems to me to be on the right track in concentrating not
just on the choice of whether or not to do A, but also on the choice to
move into the threat or offer situation. But it is difficult to state a principle,
which gets all the details right, and which is not trivial and unilluminating
(as one would be which said: if P moves Q from S1 to S2 via threats
then . . . ). It seems that rather than speaking (just) of act A being fully
one’s own choice, one should speak of its being fully one’s own choice to
do A rather than B. I have in mind the following sort of case. P intentionally
breaks Q’s leg (intentionally moving him from S1 [no broken leg] to S2
[broken leg]). Q would prefer not making this move, and afterwards would
prefer not having made it. But once Q has a broken leg, he chooses to
have a decorated cast put on it, rather than a plain white one. If we just
look at the act of wearing a decorated cast, we will have difficulties, for
surely it is not Q’s own choice (he was forced into a position where he had
to wear a cast, etc.), yet in some sense it is. It seems to me more
illuminating to say that wearing a cast rather than none was not fully Q’s own
choice, wearing a decorated cast rather than a plain one was fully Q’s own
choice, and wearing a decorated cast rather than none was not fully Q’s
own choice. It is not clear how to state a principle which takes this and
similar complications into account, and is not open to obvious difficulties.
I do, however, want to suggest that we shall not be able to understand
why acts done because of threats are not normally fully voluntary, fully
one’s own choice, etc. whereas this is not normally the case with acts
done in response to offers, if we attend only to the choice confronting the
person in the threat and offer situations. We must look also at the
(hypothetical) choice of getting (and willingness to get) into the threat and offer
situations themselves.
We have said that if P coerces Q into not doing A then (part of) Q’s
reason for not doing A is to avoid or lessen the likelihood of P’s threatened
consequence. Assuming that all of the conditions in the first section of this
paper are satisfied, then

(a) In the case where Q’s whole reason for not doing A is to avoid or
lessen the likelihood of P’s threatened consequence (ignoring his reasons
for wanting to avoid this consequence), P coerces Q into not doing A.48

(b) In the case where P’s threatened consequence is not part of Q’s reason
for not doing A (even if it is a reason Q has for not doing A) then P
does not coerce Q into not doing A.

But the case is more difficult when P’s threatened consequence is part
of Q’s reason for not doing A, and other reasons which Q has for not
doing A (which do not involve threats) are also part of his reason for
not doing A. For in this case, Q contributes reasons of his own; it is not
solely because of the threat that he refrains from doing A. If we had to
say either that this situation was one of coercion, or was not one of
coercion we would, I think, term it coercion.49 But, I think, for such cases
one is inclined to want to switch from a classificatory notion of coercion
to a quantitative one.50
Let me indulge in a bit of science fiction. Suppose that one were able
to assign weights to the parts of Q’s reasons for not doing A, which indicated
what fraction of Q’s total reason for not doing A any given part was.51
One might then say, if P’s threat was n/mth of Q’s total reason for not
doing A, that Q was n/m-coerced into not doing A. If P’s threat is Q’s
whole reason for not doing A (no part of Q’s reason for not doing A)
then Q is 1-coerced (0-coerced) or, for short, coerced (not coerced). And,
in the absence of precise weights, one might begin to speak of someone’s
being partially coerced, slightly coerced, almost fully coerced into doing
something, and so forth.52 Furthermore, without claiming that a person is
never to be held responsible for an act he was coerced into doing, we might,
for some cases in which his reasons (other than the threat) for doing
an act aren’t sufficient to get him to decide to do the act, be led to speak
of a person’s being (held) partially responsible for his act; not completely
responsible because he did it partly because of the threat, and not complete
absence of responsibility because he didn’t do it solely because of the
threat, but contributed some reasons of his own. I would end by saying that
the consideration of such a view of responsibility, and the tracing of the
modifications in what has been said thus far introduced by a
thoroughgoing use of the notion of n/m-coerced, would require another paper—
were it not for the thought that some readers might take this as a threat.
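
The arithmetic behind the notion of being n/m-coerced can be sketched as follows. This is an illustration only, in the same science-fiction spirit: it assumes that whole-number weights for the parts of Q's reason are somehow available, and the function name `degree_of_coercion` and its parameters are hypothetical, introduced here for the example.

```python
from fractions import Fraction
from typing import Sequence


def degree_of_coercion(threat_weight: int, other_weights: Sequence[int]) -> Fraction:
    """Fraction of Q's total reason for not doing A that is due to P's threat.

    threat_weight: assumed weight of the threat within Q's total reason.
    other_weights: assumed weights of Q's other reasons (those not involving threats).
    """
    if threat_weight < 0 or any(w < 0 for w in other_weights):
        raise ValueError("weights must be non-negative")
    total = threat_weight + sum(other_weights)
    if total == 0:
        raise ValueError("Q must have some reason for not doing A")
    return Fraction(threat_weight, total)


# The threat carries weight 2 and Q's one other reason carries weight 1,
# so the threat is two thirds of Q's total reason.
two_thirds = degree_of_coercion(2, [1])
```

The limiting cases match the text: when the threat is Q's whole reason the result is 1 (coerced), and when it is no part of his reason the result is 0 (not coerced).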

NOTES

1. An earlier version of part of this paper was read at Columbia University
and at Brown University, and I have benefitted from the ensuing discussions. I
have also benefitted from discussing some of the issues treated here with Professor
Gerald Dworkin.
2. A useful place to begin thinking about unfreedom is with Felix Oppenheim’s
Dimensions of Freedom, which also classifies the first three examples as cases of
unfreedom, and the fourth as not. Though I have found this book very illuminating,
I should note that I believe that a correct account of unfreedom will differ
significantly from the one it presents.
3. Hart discusses coercion only in passing, and Hart and Honore do not discuss
coercion in any detail, but instead discuss the more general notion of getting
someone to do something. No doubt they would have presented things slightly
differently had they focussed specifically on coercion.
I present their conditions as conditions for coercing someone into not performing
an action. It is obvious how the conditions must be modified to yield an account of
coercing someone into performing an action. In the course of my discussion I
sometimes produce an example as an objection to a condition in the account of
coercing someone into not doing something which is more naturally interpreted as
an objection to the corresponding condition in the account of coercing someone
into doing something. Since once the point of an example is seen, it is easy to think
of another example for the corresponding condition, I present the examples without
regard to whether they apply to a condition under discussion or the corresponding
one.
4. Hart and Honore list one further condition: Q forms the intention of not
doing A only after learning of P’s threat. That Q formed the intention of not
doing A after learning of P’s threat may be reason for thinking that he refrained from A because
of the threat. But Q may have refrained from doing A because of the threat even
though he formed the intention of not doing it before learning of the threat. For
example (this example applies to the corresponding condition), Q intends to visit
a friend tomorrow. P threatens him with death if he doesn’t go. Q then learns that
this friend has a communicable disease such that were it not for the threat, Q
wouldn’t visit him. But Q goes because of P’s threat, though he’d formed the
intention of going before learning of the threat, and never lost this intention.
Though Hart and Honore’s further condition is not satisfied, P coerced Q into going.
5. Or, to handle cases of anonymous threats: Q knows that someone has
threatened to do something mentioned in (1), if he, Q, does A.
6. No weight should be placed on the word consequence. Sometimes it will
be more appropriate to say “result”, “effect”, “state of affairs”, “event”, etc. Perhaps
the condition should be formulated by saying “the thing which P has threatened
to bring about.”
7. This condition requires further refinement to handle cases in which
unbeknownst to P, Q wants to avoid P’s inflicting the threatened consequence only because
this will lead to some further consequence detrimental to P, which Q (only out
of concern for P’s interest) wants to avoid. For example, Q refrains from A because
he knows that P will feel enormously guilty after he’s inflicted the consequence,
and Q doesn’t want this to happen. Or, Q refrains from A in the face of P’s threat
to fire him only because without Q working for him, P will go bankrupt, and Q
doesn’t want this to happen. I shall not pursue here the details of a principle
which would exclude these as cases of coercion.
8. I included the latter disjunct since I can threaten you with a consequence which
I don’t believe would actually worsen your alternative of doing A, but which I
know that you believe would do so. Subtle questions arise about cases where
it is the making of the threat itself that causes the person to believe that one
consequence is worse than another. For example, a Gestapo agent questioning a
prisoner believes that two concentration camps are equally bad, and the prisoner
too initially believes this. The Gestapo agent tells the prisoner, in a threatening
voice, that he will be sent to a concentration camp in any case, but if he cooperates
during the questioning he will be sent to the first camp, whereas if he does not, he
will be sent to the second camp. Here it is the very making of the threat which
causes the prisoner to think that the second camp is worse than the first.
I might note one refinement of this condition, to handle cases where (part of)
P’s reason for so deciding is as described in the condition, but this part drops away
and P sticks with his decision for another reason entirely and thereafter announces
the decision. It might be more appropriate to say something like: (Part of) P’s
reason, at the time he informs Q he will bring about the consequence or have it
brought about if Q does A, for planning to bring about the consequence or have it
brought about if Q does A, is that P believes. . . .
The condition in the text should be interpreted or extended so as to cover cases
in which the worsening of Q’s alternative of doing A is not part of P’s reason for
deciding to bring about the consequences if Q does A, but rather
(a) P decides to bring about the consequences if Q does A because he believes he
has a duty or obligation to do so.
(b) P knows this consequence would worsen Q’s alternative of doing A.
(c) Part of the reason for P’s bringing about of such a consequence if Q does A
originally being thought to be his duty or obligation, or being continued to be so
thought, is that such a consequence worsens Q’s alternative of doing A.
9. This disjunction is condition 3'. Thus the full condition 3' is: (Part of) P’s
reason for deciding is . . . , or, if P hasn’t decided, (part of) P’s reason for saying
is . . . . An alternative condition would be just: (Part of) P’s reason for saying is
to get Q not to do A, or to worsen. . . . This alternative condition differs from the
one under consideration for cases where P has decided to bring about the
consequence if Q does A, and no part of his reason or motive is as described, but part
of his reason for telling Q that he will bring about the consequence if Q does A, is
to get Q not to do A. I find it difficult to decide between these conditions, though
I lean towards the one presented in the text. A specific example for which the
condition in the text and the alternative condition diverge is discussed (as case 3)
in the section on Threats and Warnings.
10. Note the difference, with respect to coercion, between saying to a man who
intends to do A in order to bring about x:
(1) If you do A, I’ll do something which (just) prevents your A from bringing
about x.
(2) If you do B, I’ll do something which would, were you to do A, (just) prevent
your A from bringing about x.
11. It was noted earlier that little weight should be put on the word consequence.
Here we are considering cases where one is tempted to say that P does not bring
about any consequence; he just prevents Q from bringing one about.
12. One naturally notes, for many examples of the sort under consideration, that
if Q does not do A and P performs the action which would thwart Q’s A achieving
x, had Q done A (it is not always possible for P to perform this action if Q doesn’t
do A; e.g., “If you mail the letter, I shall intercept it before it reaches him”), then
no bad consequence is visited upon Q. But a condition built upon this observation
would fail on two counts:
(1) It would count Q as coerced when he refrains from doing A because P
threatened to do B if Q does A, where B is such as to just prevent Q’s A
from bringing about x, if Q does A, and to inflict great harm on Q if he
does not do A.
(2) It would count Q as not coerced when he refrains from doing A because P
threatened to bring about a consequence if Q does A, which consequence is
harmful to Q only if he does A.
13. I should note that I do not discuss in this paper, and wish here to leave open,
two further conditions. (Hart mentions something in the area of the first.)
(1) The consequence which P has threatened is so weighted by Q as to override
the weight which Q (morally) ought to give to not doing A.
For example, Q who is not in dire financial condition, and would just slightly
rather not kill people (he feels about killing people as most people do about killing
flies), kills R because P has threatened not to return the $100 he’s borrowed from
Q unless Q kills R. Did P coerce Q into killing R?
(2) The weight which Q does give to not doing A does not fall far short of the
weight he (morally) ought to give to not doing A.
For example, Q destroys R’s home because P has threatened a consequence, if Q
does not destroy the home which R has laboriously built, which Q weights and
anyone (morally) ought to weight as worse than destroying R’s home. However, Q
just slightly would rather not destroy R’s home. Did P coerce Q into destroying
R’s home?
14. If one is reluctant to say that the members of the gang have threatened him,
then a slightly more complicated account of coercion, in terms of threats and
implicit, or quasi-, or surrogate threats, must be offered.
15. Complication: Suppose that as the stranger is being beaten, he says that if
they stop and promise to release him, he’ll sign over a traveller’s check to them for
$1,000. They stop, he signs it over, they release him. Was he coerced into signing it
over?
16. Slightly modifying the conditions previously set out, we obtain, for these
situations:
(1) P performs an action such that, if Q then does A a certain consequence will
ensue.
(2) A with this consequence is substantially less eligible as a course of conduct
for Q, than A without this consequence.
(3) P knows that the act he’s performed satisfies (1) and (2), and intends Q
to know, and know he’s intended to know, that such an act has been
performed.
(4) (Part of) P’s reason for performing his action is that (he believes) its
consequence if Q does A would be believed by Q to worsen his alternative of
doing A.
(5) Q does not do A, and (part of) Q’s reason for not doing A is to avoid or
lessen the likelihood of this consequence.
(6) Q believes that P (or that someone) has done something intending that this
consequence, which he thinks Q will think bad, will ensue if Q does A, and
Q believes that he is intended to know (and intended to know that he is
intended to know) this.
(7) Q believes that, and P believes that Q believes that the consequence of P’s
action if Q does A would leave Q worse off than if Q didn’t do A and P
didn’t do his act.
17. If it is coercion several interesting questions arise. I shall mention only one
which has no obvious analogue about the first kind of coercion discussed. If
conditions (1)-(7) apply to P and Q, and person R, whom P has never thought of (it
is no part of P’s reason for performing the act that it worsens R’s alternative of
doing A) refrains from doing A to avoid the consequence (P’s act is such that
though directed to Q, it would inflict the consequence upon anyone who does A),
was R coerced into not doing A?
Though many questions that arise for this notion correspond to questions about
the first notion, it is not obvious that the corresponding questions about the two
notions must be answered in the same way. In particular, one is more willing, I
think, to call a case of the second sort a case of coercion (assuming that some cases
of this sort are) even though P lacks some of the specified intentions or reasons,
than one is to call a case of the first sort, when P lacks some of the specified in¬
tentions or reasons, a case of coercion.
18. It should be mentioned, in fairness to my mother, that this example was sug¬
gested during the discussion at Columbia by someone whose name I shall not men¬
tion, in fairness to his mother.
It is as a special example of this sort of situation that one might understand the
activities of some charitable organizations which, along with an appeal for funds,
send a “gift,” attempting perhaps to present one with the alternatives of
(a) returning the gift, making no contribution, and feeling slightly embarrassed,
(b) keeping the gift while making no contribution, and feeling somewhat guilty
and uneasy,
(c) making a contribution.
19. An interesting question arises for accounts of a notion, such as mine, which
(attempt to) provide necessary and sufficient conditions for the central part or core
of the notion, and then handle further cases by specifying the relations in which
they stand to the core cases. Given a set of conditions, which are purported to be
necessary and sufficient for the core cases of a notion, and given an example to
468 3. THE CONSTRUCTION OF THE GOOD

which the notion applies but which does not satisfy the conditions, how is it to be
decided whether the example is a counter-example requiring the modification of the
conditions, or whether the conditions are to be retained and the example handled as
a non-core case by specifying its relation to cases satisfying the conditions? I would
hope that the reader does not find objectionable my treatment of some particular
cases as cases which should satisfy the core-conditions (in the previous section) and
of some other cases as not being core cases (later in this section), even though the
basis for choosing to treat the particular cases as I do is not stated here.
An alternative procedure to the one followed in the first part of this section is to
accept the previous account as the full account of coercion, and to widen the notion
of what actions a threat is about. (Why this is an alternative procedure will become
clear as the reader comes to the numbered statements which soon follow in the text.
In the notation used in these numbered statements, the alternative procedure would
involve saying that the threat is not only about the act A, but is also about the
B’s.) There seems to me to be some slight reason for the course followed in the
text, but it is not clear that anything very important depends upon which way one
proceeds.
20. Readers who notice my sloppiness in the use of quotation marks here will
know how to remedy it. One may wish to limit the final formulation of such a
principle so that some cases in which P does not know that r1 and r2 have the same
reference, do not count as P coercing Q into doing B.
21. Note that the consequent of 4 is not equivalent to: P coerces Q into doing B1
or P coerces Q into doing B2, or . . . , or P coerces Q into doing Bn.
22. One may be reluctant to apply this principle, as it stands, to situations where
unbeknownst to P, Q is specially handicapped so that he can do A only by doing
some horrendous Bi. One must also be careful not to misinterpret conclusions reached
by applying this condition, as in the case (where n = 1) where R advises Q to go to
the movies and P threatens Q with death if he does not go to the movies. Since Q,
whom P coerces into going to the movies, can go to the movies only by doing what
R advised, by applying the condition we reach the conclusion that P coerces Q into
doing the action advised by R, which is easily misinterpreted.
23. A useful question to consider is why the statements obtained from (1) — (8)
by replacing “coerces Q into doing A” by “persuades Q to do A” (making other
obvious changes) are unsatisfactory, where (1) — (8) themselves are not unsatis¬
factory in the same way.
24. Other writers (Lasswell and Kaplan, Dahl) do not say that threats are involved,
but claim that inducements, or positive rewards, coerce.
25. A more complicated statement would be required to take into account condi¬
tion 7 in the section on Conditions for Coercion. (This condition was prompted by
the example in which P says that he will turn off his hearing aid if Q says another
word.) Since in the present section no examples which violate condition 7 are con¬
sidered, the complication is omitted.
26. P can threaten Q even if the consequence does not worsen the consequences
of Q’s doing the action, so long as P believes it does. Similarly for offering. We
shall not consider this complication, since such threats and offers will normally not
have the appropriate result in actions of Q.
27. I ignore problems about consequences along a continuous dimension, where
there may be no first point of changed preference, or of change from preference to
indifference.
28. The notion discussed here should be distinguished from another in which both
threatening Q with x if he does A and offering him y if he doesn’t do A are said
to predominantly involve an offer (threat) if for almost any action B, if Q is both
threatened with x if he does B and offered y if he does B, Q will prefer to do B
(not do B).
29. I ignore problems arising from a divergence between what P believes to be
the morally expected course of events and what is the morally expected course of
events, e.g., where P believes he’s morally required to let Q drown, although he’s
morally required to save Q.
Consider a further case (after Braithwaite): P and Q are neighbors, and P would
practice his violin each night, whether or not Q is in his own apartment. Q detests
hearing P practice, and asks P to stop. P refuses. The question of monetary com¬
pensation for P’s stopping is raised. Suppose that Q’s property rights are not violated
by P’s practicing, and that $500 is the least amount of money which could get P to
stop practicing for one year, and that $2,000 is the greatest payment Q would make
to P to stop practicing for one year. (Both amounts indicate their real preferences.)
Suppose that P says that he will stop practicing for one year if and only if Q pays
ROBERT NOZICK 469

him $n. One intuitively wants to say that there are some amounts n such that P
would be offering to stop in return for $n, and some (higher) amounts such that it
would be a threat not to stop unless Q paid P the money. The difficulties in de¬
vising a theory of a reasonable, or just, or fair price need no elaboration here. Dis¬
agreements over the range in which a fair price would fall or over whether there is
any coherent notion of a fair price which applies to this situation, may lead to dis¬
agreements about whether we here have a threat or an offer.
30. An alternative view would hold that it is an offer, but that doing something
because of such an offer counts as being coerced into doing it. This would require
modification of our earlier account of coercion to include doing things because of
such offers. (Such offers being offers to not continue the usual though morally for¬
bidden course of events, and to switch (at least temporarily) to the morally expected
course of events which the recipient of the offer would prefer to the usual course
of events.) Readers who find unsatisfactory my here calling the slave owner’s state¬
ment a threat, may call it an offer and treat “threat” appearing in the section on
Conditions for Coercion as a technical term which includes such offers.
31. What if he prefers that he not be supplied with dope, but can’t resist buying
it when it’s available?
32. Let me suggest as a fertile area for testing intuitions and theories, the fol¬
lowing, where the normal and morally expected courses of events may diverge, and
where it may not be clear what the morally expected course of events is. Suppose
some nation N were to announce that it will in the future give economic aid to
some other countries provided that these other countries satisfy certain conditions
(e.g., do not vote contrary to N on important issues before the United Nations, do
not trade with specific nations, do not have diplomatic relations with specific na¬
tions). Would this announcement constitute an offer to give these nations aid, or a
threat not to do so? Relevant factors (to list just two of many) are whether or not
N has an obligation or is morally required to give these nations economic aid (inde¬
pendently of whether they satisfy the conditions), and whether or not N has pre¬
viously given these nations economic aid independently of whether they satisfy the
conditions.
33. Though this seems to me to be the correct thing to say, there is a problem,
which I have not yet been able to solve, which I should briefly mention.
Letting P = you are punished
C = you commit a crime
the officials in the society might say
P if and only if C
or equivalently
not-P if and only if not-C.
Interpreted truth-functionally, each of these is equivalent to either (P and C) or
(not-P and not-C). The two remaining possibilities are (P and not-C), and (C and
not-P). The background we want to use in deciding whether a threat is involved is
C and not-P. If we were to use the remaining possibility, P and not-C, as the back¬
ground, then it would turn out that an offer is involved here. The problem is to
formulate criteria, in cases where the biconditional is itself part of the normal and
expected course of events, which pick out C and not-P rather than P and not-C in
this and other threat cases, and which would pick out the appropriate background for
offer cases as well.
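The truth-functional point in this note can be checked mechanically. What follows is only an illustrative sketch in Python, not part of the original text; the function names are assumptions introduced here. It enumerates the four combinations of P (you are punished) and C (you commit a crime), confirms that the two biconditionals agree on every row, and returns the two combinations they exclude, which are the two candidate backgrounds mentioned above.

```python
from itertools import product


def iff(x: bool, y: bool) -> bool:
    """Truth-functional biconditional: true when both sides match."""
    return x == y


def excluded_cases() -> list[tuple[bool, bool]]:
    """Return the (P, C) rows on which 'P if and only if C' is false."""
    rows = []
    for p, c in product([True, False], repeat=2):
        first = iff(p, c)            # P if and only if C
        second = iff(not p, not c)   # not-P if and only if not-C
        # The note treats the two statements as equivalent; check each row.
        assert first == second
        if not first:
            rows.append((p, c))
    return rows


if __name__ == "__main__":
    for p, c in excluded_cases():
        print(f"excluded: P={p}, C={c}")
```

The check shows only that the biconditional excludes both (P and not-C) and (C and not-P). It does not settle which of the two should serve as the background, which is the open problem the note describes.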
34. This is a case of blackmail which presumably should be legally forbidden
because allowing it increases the probability that crimes will go undetected. Other
reasons apply to other cases, but note that it is not obvious that one wants to legally
forbid all cases which fit the description: saying that one will make public some in¬
formation about Q unless Q pays money. For example,
(a) P’s saying that he will make public the information that Q has not paid P
the money Q owes him, unless Q pays the money.
(b) P is writing a book, and in the course of his research comes across informa¬
tion about Q which will help sell many copies of the book. P tells Q he will
refrain from including this information in the book if and only if Q pays
him an amount of money equal to the expected difference in his royalties
between the book containing this information and the book without the in¬
formation.
35. A coercive consequence for Q’s A is a consequence which has been threatened
if Q does A.
36. More precisely, the credible threat of raising the probability of this conse¬
quence from what it is without P’s aiding in bringing it about, to what it would be
with P’s aid.
37. Lawyers sometimes speak of the distinction as being between threats and pre¬
dictions. Since philosophers sometimes contrast predictions and statements of in¬
tention, and the latter may be “predictions” in the lawyer’s sense, to avoid confusion,
I speak of threats and nonthreatening warnings.
38. Stipulating that each of the members of some particular majority has this pre¬
ference ordering enables us to avoid problems, relevant to our concern here, about
non-transitive majorities.
39. I consider here cases where the employer could stay in business (without
running at a loss) even if the union wins. In the case where, if the union wins, the
employer cannot both stay in business and out of the red, it is clear that his state¬
ment is a warning and not a threat. (In the normal course of events he does go
out of business if the union wins, and chooses to do so earlier than he must, in order
to cut his losses.) I assume, for the cases discussed in the text, that the employer
making the statement intends to close if the union wins. If he intends not to close, or
has no settled intention either way, then in stating that he will close if the union
wins he is making a threat.
I should note that I am assuming for some of the cases in which the employer
could profitably stay in business if the union wins, that he does not have an obliga¬
tion to and is not morally required to remain in business if the union wins the elec¬
tion. It may be that some disagreements about whether the employer is threatening
or warning stem from disagreements about whether he is morally required to remain
in business if the union wins (morally required not to close because of dislike of
running a unionized business, etc.).
40. Cf. Schelling, The Strategy of Conflict. Note that according to contemporary
utility theory, it will be reasonable for the employer to rule out (b) for strategic
reasons even if he doesn’t know the preference ranking of the employees, so long as
p u(a) + (1 - p) u(c) > u(b),
where p is the probability that the union will lose the election after he’s announced
that if they win, he will go out of business, and u(x) is the utility of x to the em¬
ployer.
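The inequality in this note can be illustrated with a small calculation. The following Python sketch is an illustration only: the function name and the numeric utilities are invented for the example, and (a), (b), and (c) stand for the outcomes labeled that way in the main text, which are not restated here.

```python
def ruling_out_b_is_reasonable(p: float, u_a: float, u_b: float, u_c: float) -> bool:
    """Check the note's condition p*u(a) + (1 - p)*u(c) > u(b).

    p   : probability that the union loses after the announcement
    u_a : employer's utility for outcome (a)
    u_b : employer's utility for outcome (b)
    u_c : employer's utility for outcome (c)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    expected_utility_of_announcing = p * u_a + (1 - p) * u_c
    return expected_utility_of_announcing > u_b


if __name__ == "__main__":
    # Hypothetical utilities, chosen only to show both sides of the condition.
    print(ruling_out_b_is_reasonable(p=0.8, u_a=10.0, u_b=5.0, u_c=0.0))
    print(ruling_out_b_is_reasonable(p=0.3, u_a=10.0, u_b=5.0, u_c=0.0))
```

With these invented numbers the condition holds when the announcement makes a union loss likely (p = 0.8) and fails when it does not (p = 0.3). This is the sense in which the employer need not know the employees’ preference ranking, only his own estimate of p.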
41. I should note that I have done nothing here to argue, as I would wish to,
that acting on such strategic considerations is not part of the normal or expected
course of events which forms the background to discussing questions of coercion.
Not doing something unless you’d first announced it in this sort of strategic situa¬
tion, for strategic reasons, should be distinguished from not doing something without
prior announcement for other sorts of reasons; cf. discussions of ex post facto laws.
42. I wish here to exclude threats against certain acts which harm others, etc.
It is difficult to determine whether there is a presumption against threatening some¬
one against (or coercing someone into not), e.g., murdering someone else, which is
almost always easily overridden, or whether there is no such presumption in such
cases. It is also difficult to determine exactly what the difference is between these
two alternatives. For an attempt to describe the difference, see my “Moral Complications
and Moral Structures,” Natural Law Forum, Vol. 13, 1968, section 7.
I do not discuss the possibility that the employer in case 3 is engaging in the
second type of coercion discussed in the section on Conditions for Coercion, so that
his prior act of announcement worsens the employees’ alternative of electing the
union because they will feel worse having the factory close when they have been
warned of this than they would if, without warning, they elected the union and the
factory closed.
43. The Supreme Court held in Textile Workers Union v. Darlington Manu¬
facturing Co. (380 U.S. 263 (1965)) that it is not an unfair labor practice for an
employer to close his entire business, even if the closing is due to antiunion animus,
but that closing part of a business is an unfair labor practice if the purpose is to
discourage unionism in any of the employer’s remaining plants, and if the employer
may reasonably have foreseen such an effect. Given the difficulties in determining an
employer’s purpose, one suspects that the effect of this decision will be to forbid
all employers from closing part of their business because it has become unionized,
if other parts of the business are not unionized.
44. This brief description is meant to indicate the area of concern rather than
as an account of paternalistic legislation. In such an account one would have to dis¬
tinguish this sort of legislation from other legislation, often called paternalistic, which
provides for adults what parents are expected to provide for children, e.g., food,
shelter, money. (I do not claim that no common account of the two sorts of legisla¬
tion can be given.) It is held, by people who call such legislation paternalistic, that
adults are supposed to provide these things for themselves, or through agreements
with other citizens qua private citizens. When legislation and governmental institutional
arrangements provide things which parents provide for children but which
adults are not supposed to provide for (solely by) themselves, e.g. protection from
the infliction of violence by others, such provision is not termed paternalistic.
For the area of concern in the text, since different sorts of reasons can be offered
for the same piece of legislation, one should speak of paternalistic reasons for legis¬
lation rather than of paternalistic legislation. One wants an account of paternalistic
reasons for legislation to have the consequence that some reasons put forth for
legislation which would make people unfree to manufacture or sell cigarettes would
be paternalistic reasons even though they do not involve the protection of the (per¬
haps nonsmoking) persons made unfree to manufacture and sell cigarettes. I shall
not pursue the details here. Note that some paternalistic acts can involve great self-
sacrifice, as when drugs are legally forbidden in order to protect those who are not
addicts who would be so under a system in which drugs were legal. The price we
pay to protect them is increased risk of being robbed or assaulted by addicts trying
to acquire money in order to pay the high prices on the illegal market, plus the
diversion of resources into trying to enforce the law. Perhaps it is appropriate that
we should all suffer for our original unjustified paternalistic intervention.
The reader might find it useful, in thinking about paternalism, to consider
whether there are any limits to the severity of the penalty we would include in a
paternalistic law, and how these limits are to be fixed. Could we, for example, have
the death penalty for the offense of swimming at a beach when no lifeguard is
present? Certain plausible-looking principles would allow this, because when the
system including this penalty is instituted, it is the one of the alternatives which is
expected to best operate for the person’s own good. Surely something has gone hay¬
wire here.
45. Distinguishing two types of offers in a manner similar to the earlier distinc¬
tion between two sorts of coercion, this case does not fit the first type of case. For
here it is not the case that after Q beats P up, P will then do something which im¬
proves the consequences for Q of this action. (Even if P will then do this, e.g. spread
the news that Q is not to be threatened, it is not an offer for the reason mentioned
in the text.) And even though it fits the second sort of case, that is P now does
something (threatens Q) which improves the consequences of Q’s beating him up,
P is not offering something to Q to beat him up, since P lacks the requisite reasons
involved in making an offer.
46. I omit consideration of offers to do acts such that, if the offer is made, the
act cannot be done without accepting the offer, e.g. one cannot work at certain
government jobs without receiving a salary of at least one dollar per year. The
Rational Man may sometimes prefer doing the act without the offer’s having been
made, so that it will be clear to others, and perhaps himself, why he does the act
(e.g. not for the money). I also shall not consider the case of a person’s not wel¬
coming an offer for him to perform a malicious act because of what it shows about
the person making the offer. Note the importance of our restricting our attention
here to the Rational Man. Another person might not, for example, welcome an offer
of $50,000 for him to kill Jones, because he’s afraid he may be tempted to (and
unable to resist the temptation to) accept the offer. In considering only the Rational
Man, I am leaving part of my task undone. For I do not argue, as I would wish to,
that even for someone who sometimes succumbs to temptations which he believes
he ought to resist, there is a significant difference between offers and threats.
47. For a discussion of the prisoner’s dilemma, cf. Luce and Raiffa, pp. 94-102.
One often finds this argument applied to questions about the provision of a public
good for a group. For example, each inhabitant of an island might prefer that others
contribute to the construction of barriers against the sea while he does not, yet
prefer everyone’s being forced to contribute to contributions being left purely vol¬
untary in which case, let us suppose, the barriers won’t get constructed. (For a dis¬
cussion of the conditions under which a public good for a group will be provided, cf.
Olson. Buchanan and Tullock argue that public goods for a group will be provided
more often than one might think.)
One must be wary of concluding too quickly from this line of argument that
there will be unanimous consent to the provision of the public good by forcing
everyone to contribute. For there will generally be alternative ways in which the
public good can be provided, and individuals even if they all agree that each of
these ways is preferable to the purely voluntary situation, may differ about which
of the ways should be used. Should the good be paid for from funds gathered via a
system of proportional taxation, or one of progressive taxation? And so forth. It is
not obvious how unanimous consent to one particular way of providing the good is
supposed to arise.
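The preference structure of the island example can be made concrete with a toy model. The Python sketch below is an illustration under stated assumptions, not an account taken from Luce and Raiffa or Olson: it assumes a linear public good in which each contribution yields a fixed benefit to every inhabitant and a larger fixed cost to the contributor. The numbers (5 inhabitants, benefit 1, cost 3) are invented.

```python
def payoff(contributes: bool, others_contributing: int,
           benefit: float = 1.0, cost: float = 3.0) -> float:
    """Payoff to one inhabitant in an assumed linear public-good model.

    Every contribution (one's own included) is worth `benefit` to each
    inhabitant; contributing costs the contributor `cost`.
    """
    total_contributions = others_contributing + (1 if contributes else 0)
    return benefit * total_contributions - (cost if contributes else 0.0)


if __name__ == "__main__":
    n = 5  # hypothetical number of inhabitants
    # Whatever the others do, each inhabitant does better by not contributing.
    for k in range(n):
        assert payoff(False, k) > payoff(True, k)
    print("everyone forced to contribute:", payoff(True, n - 1))
    print("purely voluntary, nobody contributes:", payoff(False, 0))
    print("free rider while the others contribute:", payoff(False, n - 1))
```

Under these assumptions each inhabitant ranks free riding first, universal forced contribution second, and the purely voluntary outcome with no barriers last, which matches the ordering described in the note. The model says nothing about the further question the note raises, namely which of several schemes for financing the good would command unanimous consent.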
48. Even if Q has other reasons for not doing A. We distinguish between “Q has
a reason r for not doing A,” and “r is (part of) Q’s reason for not doing A.”
49. This indicates an asymmetry between doing something (partly) because of a
threat, and doing something partly because of an offer. For suppose that the other
reasons Q has for not doing A which are part of his reasons for not doing A, in¬
clude an offer by R for Q not to do A. Using a classificatory notion of coercion,
doing A partly because of a threat shows the person was coerced, whereas doing
something partly because of an offer does not show that he was not coerced.
50. For a discussion of classificatory, comparative, and quantitative concepts cf.
Hempel, Part III. An illuminating study of different scales of measurement is Suppes
and Zinnes.
51. A good place to begin in thinking about the weight of reasons is with Ernest
Nagel’s discussion of the weight of different causes in The Structure of Science,
pp. 582-588.
52. There are other factors which one might wish to build into a quantitative
concept or measure of coercion, though perhaps there is no natural way to combine
all of the factors into one measure. For a discussion of some similar questions about
measuring freedom cf. Oppenheim, ch. 8.

REFERENCES

Bay, Christian. The Structure of Freedom. Stanford University Press, 1958.


Buchanan, James M., and Tullock, Gordon. The Calculus of Consent.
University of Michigan Press, 1962.
Dahl, Robert. Modern Political Analysis. Englewood Cliffs, New Jersey,
Prentice-Hall, 1954.
Hale, Robert L. Freedom Through Law. Columbia University Press, 1952.
Hart, Herbert L. A. The Concept of Law. Oxford at the Clarendon Press,
1961.
Hart, Herbert L. A., and Honore, A. M. Causation in the Law. Oxford at
the Clarendon Press, 1959.
Hempel, Carl G. Fundamentals of Concept Formation in Empirical Science.
University of Chicago Press, 1951.
Lasswell, Harold, and Kaplan, Abraham. Power and Society. Yale University
Press, 1950.
Luce, R. Duncan, and Raiffa, Howard. Games and Decisions. New York,
John Wiley and Sons, 1957.
Nagel, Ernest. The Structure of Science. New York, Harcourt, Brace, and
World, 1961.
Olson, Mancur. The Logic of Collective Action. Harvard University Press,
1965.
Oppenheim, Felix. Dimensions of Freedom. New York, St. Martin’s Press,
1961.
Schelling, Thomas. The Strategy of Conflict. Harvard University Press,
1960.
Suppes, Patrick, and Zinnes, J. L. “Basic Measurement Theory,” in R. D.
Luce, R. Bush, and E. Galanter, eds., Handbook of Mathematical Psychology,
Vol. I. New York, John Wiley and Sons, 1963.
EXISTENTIALISM AND DEATH: A Survey
of Some Confusions and Absurdities1
Paul Edwards

This paper is not meant to be an exhaustive discussion of existentialist
pronouncements about death. Some, like the curious notion that life is
“essentially being towards death,” are not dealt with at all and others,
like the view that an “authentic” mode of life is possible only for a person
who “resolutely confronts death,” are no more than mentioned in passing.
My aim has been to cover those existentialist doctrines which are tied, in
one form or another, to confused ways of thinking about death common
among people in general and which occur independently of the efforts of
the existentialists.

DEATH AS SLEEP IN THE GRAVE

Most human beings, whether they are religious believers or not, appear
at times to have great difficulty in regarding death as truly and really the
absence of life. In some contexts they do treat death in this way, but at other
times they think of it as a restful or gloomy or undesirable continuation of
life. There is a very common tendency to think of a dead person as sleep¬
ing an extremely deep sleep in his grave—so deep that he will never
again wake up. A famous Italian conductor was once greatly upset by the
way the musicians of the New York Philharmonic were playing the move¬
ment of a Brahms symphony at a rehearsal. “If Brahms were alive,” he
finally exclaimed in exasperation, “he would be turning in his grave.”
When this story is told, it usually takes some time before people see the
absurdity of the conductor’s remark. If Brahms were alive he presumably
would find better things to do than lie in a grave.2 However, to a person
vaguely thinking of Brahms as sleeping in his grave, the conductor’s remark
will not seem absurd.
People do not have this difficulty in the case of other absences. If a
whiskey bottle is empty, nobody is likely to maintain that it is filled with
an ethereal liquid; and if one comes across a blank canvas, one is not
tempted to describe it as an exceptionally abstract painting. Yet, this is
precisely how we frequently think of death. We then refer to it more or less
seriously as “the rest which may not be unwelcome after weariness has
been increasing in old age” (Bertrand Russell), as “quiet consummation”
(Shakespeare), or perhaps as “the cool night” which follows the hot and
busy day (Heine). We also think of it as a place to which we “pass on”
or depart (at the end of our “journey”), as “the harbor to which sooner
or later we must head and which we can never refuse to enter” (Seneca),
as “the undiscover’d country from whose bourn no traveller returns”
(Shakespeare); and we tend to regard this place as dark and perhaps even
terrifying, as “eternal night” (Swinburne), “a beach of darkness . . . where
there’ll be time enough to sleep” (A. E. Housman), the “engulfing im¬
penetrable dark” (H. L. Mencken). It is not uncommon to speak of this
place as the same one which we left when we were born. Schopenhauer
speaks of birth as the “awakening out of the night of unconsciousness”3
and he wavers between regarding our return to this state of unconsciousness
as something to be welcomed and something to be dreaded. On the one
hand he writes that the “heart of man rebels” against having to return to
nonexistence; on the other he claims to be speaking for suffering mankind
who would much rather have been “left in the peace of the all-sufficient
nothing” where their days were not spent in pain or misery (op. cit., p.
389). Darrow, who shared the latter of these sentiments, spoke of life as
“an unpleasant interruption of nothingness.” “Not to be born is the most
to be desired,” in the words of Sophocles, “but having seen the light, the
next best thing is to go whence one came as soon as may be.” Pliny, who
ridicules any belief in survival as the logically baseless “fancy” of human
vanity, accuses the believers of robbing mankind of “future tranquillity.”
“What repose,” he exclaims, “are the generations ever to have,” if they
cannot be “from the last day onward in the same state as they were before
their first day.” Seneca, too, thought it fortunate that a person could always,
by a voluntary act, “escape into safety.” Advocating suicide in certain
situations, he asks, “Do you like life? Then live on. Do you dislike it?
Then you are free to return to the place you came from.” At death, Seneca
writes in another place, “you are brought back to your source.” A lamp,
he also observes, is no “worse off when it is extinguished than before it
was lighted,” and in the same way “we mortals are also lighted and ex¬
tinguished; the period of suffering comes in between, on either side there
is a deep peace.” But not all writers who regard death as a “homecoming”
think of the place to which we return as a restful abode. Thus James
Baldwin, the novelist, admonishes us to negotiate the “passage” of life as
nobly as possible—in this way we will obtain “a small beacon in that ter¬
rifying darkness from which we came and to which we shall return.”
This tendency to think of death as a shadowy and, especially, a very pain¬
ful and undesirable form of existence is reinforced by the way in which we
place death at or near one end of the scale of our punishments and ill¬
nesses. Just as two years of imprisonment are more undesirable than one
year and life imprisonment is worse than either, being sentenced to death
is regarded by most people as a worse fate yet; and even those who con¬
sider life imprisonment worse than death regard the latter as very un¬
desirable—at least as undesirable as, say, imprisonment for ten years.
Again, just as we regard a chronic illness involving some pain as worse
than a merely temporary ailment involving the same degree of pain, so we
regard a mortal illness, because it is mortal, as worse than either; and
although many people would regard some chronic (nonfatal) illnesses as
“objectively” worse than death, almost everybody treats mortal illnesses
as (necessarily) very undesirable, even if the amount of pain involved is
relatively slight. Since languishing in jail or suffering a painful illness are
states or processes of living organisms, it becomes tempting to regard death
as another, very undesirable, state of a living organism. We see, in the words
of P. L. Landsberg, a philosopher writing in the phenomenological tradi¬
tion, that “death . . . must exceed all experiences of illness, suffering or
old age.”4
Another line of reflection that may lead to a similar conclusion is sug¬
gested by Landsberg in the course of discussing the “community” that two
people may form—a husband and wife, for example, who not only love
each other but who have braved many a storm together. If one of them dies,
this “community,” this “we,” is destroyed. The surviving person experi¬
ences then a “bitter cold.” In feeling the death of the “we,” he is led into
an “experiential knowledge” of his own mortality. “My community with
this person,” writes Landsberg, “seemed shattered, but the community was
to some degree myself; and to this degree I experienced death in the very
core of my own existence” (op. cit., pp. 14-16). It is tempting to proceed
to the conclusion (though Landsberg in fact does not explicitly go that
far) that one’s own death is a more extreme instance of the same kind of
thing: even more bitter and cold than the bitter cold which the survivor
experiences upon the death of the “we.”

FEAR, ANXIETY, AND DEATH

This common human tendency to regard death not as just the absence
of life but as existence in a dark, impenetrable abode has been enshrined
into a philosophical doctrine by the Christian existentialist, the late Professor
Paul Tillich, in his “ontology” of Non-Being or Nothingness. Tillich’s doc¬
trine is introduced in connection with his distinction between fear and
anxiety (it should be noted that although Tillich’s use of these expressions
is in harmony with that of other existentialists, it is significantly different
from their use by most professional psychologists and psychiatrists). In
fear, writes Tillich, we are always facing a definite object: It may be physical
pain, the loss of a friend, the rejection by a person or a group or any
number of other things, but in each case it is something “that can be faced,
analyzed, attacked, endured,” and met by courage.5 In anxiety, on the other
hand, the object, if it can be called an object, is “ultimate nonbeing”; the
“threat” here is not due to something specific like physical pain but to
nothingness. Unlike fear, anxiety cannot be met by courage and it is almost
unendurable. “It is impossible for a finite being,” in Tillich’s words, “to
stand naked anxiety for more than a flash of time. People who have experienced
these moments, as for instance some mystics in their visions of
the ‘night of the soul,’ . . . have told of the unimaginable horror of it”
(CB, p. 39). Although fear and anxiety must not be confused with one another,
they are closely related. Among other things, there is an element
of anxiety in every fear and it is this element of anxiety which gives the
fear its “sting.”
Tillich applies his distinction between fear and anxiety to the “outstanding
example,” namely, the fear of death. There are two elements in this
fear—fear proper which has an object like an accident or a mortal illness
and anxiety whose “object is the absolutely unknown ‘after-death,’ the
nonbeing which remains nonbeing even if it is filled with images of our
present experience” (op. cit., p. 38). Tillich is very concerned that his use
of the word “unknown” should not be misunderstood. It is not any unknown
but the absolutely unknown that one faces in this “basic anxiety”
of one’s “ultimate nonbeing.” There are “innumerable realms of the un¬
known” that are faced with fear but without any anxiety. Here Tillich
probably has in mind the kind of thing that happens when a person is afraid
of a new job in which he has to perform unfamiliar tasks or when an explorer
is approaching territory about which no reports are extant. These
unknowns are not in principle unknowable. The situation is altogether different
in the case of the unknown “which is met with in anxiety.” It is an
“unknown of a special type,” which “by its very nature cannot be known,
because it is nonbeing” (p. 37). Elsewhere, in discussing man’s finitude,
Tillich observes that since man is “created out of nothing,” he must “return
to nothing.” Very much like Seneca and Pliny, he tells us that nonbeing
“appears as the ‘not yet’ of being and also as the ‘no more’ of being.” Like
all other finite entities, human beings, while alive, are “in process of coming
from and going toward nonbeing.”6 Somebody who accepts this account
would presumably hold that while Shakespeare was not far wrong when he
spoke of our ultimate nonbeing as the “undiscover’d country from whose
bourn no traveller returns,” it would have been more accurate to speak of
an “undiscoverable country.” Mencken was closer to the truth (as Tillich
sees it) when he spoke of our death as the “impenetrable dark” that must
eventually engulf us. Tillich himself indeed uses practically the same words
in one place: “We come from the darkness of the ‘not yet’,” he writes, “and
rush ahead towards the darkness of the ‘no more’.” Our “unavoidable end”
is “impenetrable darkness.”7

THE SEARCH FOR THE “ONTOLOGICAL CHARACTER” OF DEATH

Perhaps it would not be inappropriate to label Tillich an “agnostic
ontologist.” He is an ontologist in the sense that he regards death as not merely
the absence of life but as a state toward which all human beings inevitably
“rush”; and he is an agnostic in that he regards death as an unknowable
state. Other existentialists, who share Tillich’s view that death is a state, do
not agree with him that it is entirely unknowable. Prominent among those
who believe that human beings can, by suitable “existential” or “dialectical”
techniques, achieve some knowledge about the nature of death are Professor
John Macquarrie, the eminent Protestant theologian, co-translator of
Heidegger’s Sein und Zeit, and author of numerous influential works,8 and the
Spanish philosopher, Professor Jose Ferrater Mora, renowned for his
monumental Diccionario de Filosofia, and author of Being and Death,9 a work
expounding a “general ontology” in which an attempt is made to “integrate”
the achievements of the existentialists with the insights of the naturalists.
Neither Professor Macquarrie nor Professor Mora would deny that there are
grave difficulties in the way of discovering what death is, but they appear to
believe that these difficulties may, to a certain extent, be overcome. We
definitely need not, in Mora’s words, “resign ourselves to saying nothing
about death” (BD, p. 177).
Both Macquarrie and Mora engage very actively in what we may call
“the ontological quest” or the search for the “ontological character” of
death. To explain what this quest is or rather what these (and various other)
writers believe themselves to be doing, let us first note certain explicit
disclaimers on their part. Following Heidegger, both Macquarrie and Mora
regard death as more than a mere “natural happening”—as something more
than could in principle be explored by the use of scientific methods. Thus,
in asking the question “What is death?” or “What is the nature of death?”
these philosophers are emphatic that they are not asking the kind of question
that a physiologist would ask when he inquires into the nature of death.
The ontologists are also not concerned with the traditional religious question
of whether human beings live on after the death of their bodies. Nor
are they concerned with such “metaphysical” questions as “how and why
death came into the world?” Heidegger and the various ontologists writing
under his influence do not dismiss this last question or the question concerning
survival as meaningless, but they insist that their ontological quest is
more fundamental and ought to be dealt with first. Both the religious and
the metaphysical question, in Macquarrie’s words, presuppose “an ontological
understanding of death” (ET, p. 117). We cannot hope to answer
or even understand such questions until we have “clarified” the ontological
nature of death (ibid.), until “the character of death . . . has been fully
explored” (SCE, p. 50). These questions can be intelligently approached only
after we have “grasped the existential phenomenon of death” (ibid.).
All of this tells us what the ontological quest is not. We can, I think, see
what the ontological quest is or what it is supposed to be by first mentioning
certain “difficulties” which our ontological explorers freely acknowledge.
We cannot find out what death is by any straightforward employment of
experience or of the “phenomenological method.” “Death,” writes
Macquarrie,

is to be investigated by the same method of phenomenological analysis that
Heidegger employs in the rest of the existential analytic, [but] there are clearly
difficulties here that do not attend any of the other phenomena analyzed.
Understanding, moods, speech, anxiety, concern, solicitude—these are all
phenomena of existence that undoubtedly go to constitute our daily living. We
know them from experience and from continuous participation in them. . . .
All this is possible because our experience of these matters is a “living through”
them, so that we are then able to reflect upon them and describe them [SCE,
p. 51].

Unfortunately death is not like anxiety, concern, or solicitude: The dead
person, since he is no longer alive, does not experience his death and hence
the phenomenological method cannot be employed by him to study his
death. In Professor Macquarrie’s words:

Anyone who undergoes death seems by that very fact to be robbed of any
possibility of understanding and analyzing what it was to undergo death [ibid.].

The dead man’s

being is no longer lit up to himself in the only way that would seem to make
anything like an existential analysis possible, and so it appears that he cannot
by any means understand what the undergoing of death may be like [ibid.].

Macquarrie does not abandon the search after these admissions. He attempts
to get at the nature of death by a consideration of various “analogies”
and by reflections about the death of others. Although he is very emphatic
that the usefulness of these inquiries is limited, Macquarrie believes that
they lead to a “preliminary understanding” of the nature of death. Perhaps,
he asks, it is possible to compare death “to the ripeness of a fruit, which is
not something added to the fruit in its immaturity, but means the fruit itself
in a specific way of being” (ET, p. 118). This analogy, unfortunately,
breaks down at the crucial point. For “whereas ripeness is the fulfillment of
the fruit, the end may come for man when he is still immature or it may
delay until he is broken down and exhausted with his fulfillment long past”
(ibid.). Although this analogy breaks down (and the same is true of others
which I have not reproduced), Macquarrie believes that such considerations
yield a “positive result.” It becomes clear that “death belongs to my
possible ways of being—though in a unique kind of way, since it is the
possibility of ceasing to be. It is already a possibility present in existing. . . . it
shares a fundamental character of existence, and as a present possibility it
is disclosed to me and can be analyzed” (ibid.).
“May information be obtained from considering the death of others?”
(ET, p. 117). We cannot phenomenologically study our own death since
we shall not be able to do any studying when we are dead, but perhaps we
can get at the ontological character of death by paying careful attention to
what happens when others die while we are yet alive to witness their deaths.
As we mentioned previously, Macquarrie does not believe that such an
inquiry is entirely fruitless, but at the same time he admits that it does not
yield anything like a full answer to his original question. However, in the
course of this admission he makes some very revealing remarks. He points
out that when we study the death of others our phenomenological exploration
is really confined to the mental states of the survivors. Our “vicarious
experience” of the death of others cannot be adequate for “grasping death
as an existential phenomenon” (SCE, p. 52). “The death of others is experienced
as the loss sustained by those who remain behind, and not as the
loss of being which the deceased himself has sustained” (ET, p. 118, my
italics). Nor is this the only trouble. For, in addition to the fact that what we
experience is our loss and not the loss sustained by the dead person, the
latter cannot communicate to us about the loss he has sustained. He cannot
“any longer communicate with us to describe that loss of being” (ibid.).
I think we can now rephrase the ontological question as Professor Macquarrie
conceives it in the following ways:

What is death like as it is to the dead?

What is the nature of the loss sustained not by the survivor but by the
deceased?

How does the loss of being sustained by the dead person feel to him
(not to us) or, since he feels nothing any more, how would it feel to
him if he could feel it?

These questions may sound slightly mad, but they are a precise formulation
of the ontological quest as conceived by Professor Macquarrie and, in varying
degrees, by a number of other existentialist explorers as well.
Before leaving Macquarrie, we should note that in his opinion the study
of the death of others yields an important positive result. “One positive
character of death” has been ascertained: “Death is always my own since it
cannot be experienced vicariously. . . . it is untransferable and isolates the
individual. He must die himself alone” (ET, p. 118). There are innumerable
ways in which one person can represent another, “but nothing of the kind is
possible in the case of death. . . . no one can die for another, in the sense of
taking the other’s dying away from him and performing his death for him”
(SCE, p. 52). This result may be “combined” with the one achieved in
the course of the analogical inquiries mentioned earlier. Together, these
results amount to “a preliminary understanding of death as an existential
phenomenon.” This preliminary understanding may be expressed by saying
that “death appears as my own present untransferable possibility of being
no longer in the world” (ET, p. 118). In other words, “death belongs to
man’s possibility—it is, indeed, his most intimate and isolated possibility,
always his own” (ET, p. 119).
Like Professor Macquarrie, Professor Mora is much perturbed by the
difficulties in the way of a phenomenological exploration of the nature of
death. Although, he writes, “we know that there is such a thing or such an
event as death, that death is inevitable, that we all must die, and so on, we
still do not realize in full measure what death is and what it means until
we somehow ‘experience’ death” (BD, pp. 175-176). But just such an
experience seems to be excluded by the very nature of death.

We can “see” that people die; we can think of our own death as an event
which will take place sooner or later, but we do not seem to be able to experience
death in the same way as we do other “events” such as pleasure, pain,
good health, illness, senility. All we can “see” of death is its “residue,” for
example, a corpse . . . [ibid.]

It should be noted that a dead person is here automatically regarded as more
than a corpse, and it is of course this more which Professor Mora is trying
to explore.
Mora agrees with Macquarrie that we cannot get a clear view of the
nature of death, but he maintains that we can at least get some kind of
glimpse. Although we cannot ever attain a “direct and complete grasp of
the nature of death” (p. 178), our experience furnishes us with data that
may serve as the basis for “drawing some inferences” (ibid.). Mora’s object
is to get at the inside of death and he thinks he can, to some slight extent,
attain this goal by studying the attitudes which people display toward death.
“A description and analysis of some typical attitudes regarding death can
cast some light on our subject” (pp. 192-193). It is true that in studying
these attitudes we do not experience our death “exactly in the same sense
in which we can experience love, friendship, sorrow, and so on” (p. 192),
but in our investigation of attitudes toward death “we can place ourselves,
so to speak, in front of it (or its possibility).”10 Professor Mora then surveys
different attitudes displayed by people on the point of dying—those who
faced a firing squad but were reprieved at the last moment and others who
appeared to be drowning but were rescued before it was too late. After
enumerating the different kinds of feelings and thoughts that may be going
on in people “immediately preceding impending death,” Professor Mora
does not hide his disappointment and concedes that the value of such a
survey is severely limited as far as the purpose of his ontological inquiry is
concerned. It must be granted that in attending to our and other people’s
attitudes, we “see our death” only “somehow from the outside” (p. 194).
This is not as much as one could wish, but it is considerably more than
nothing—“ ‘somehow from the outside’ is not the same as ‘completely from
the outside.’ In some respects we are looking at our death (or its possibility)
from the inside; otherwise we could not even take ‘an attitude’ in front of
our death or its possibility” (p. 194, my italics).
Like Macquarrie, Mora pays much attention to the death of others, but
he is a little more sanguine in his confidence that such a study can get us to
the inside of death. In the absence of a “direct and complete grasp,” we
can at least “use analogy and conceive of our death in terms of another’s
death” (p. 192). Professor Mora recounts three personal experiences which
“are to be taken as examples of another’s death. They cover ‘cases’ which,
as happens in legal matters, can be considered ‘precedents’ ” (p. 178). We
shall here confine ourselves to the two which Mora himself regards as his
more hopeful cases. In one of them he witnessed the sudden death of a
man killed by a bullet in the course of a battle. Professor Mora had not
known this man at all and although he felt the death of this man to be
symbolic of “the universal and overwhelming presence of death,” he experienced
neither grief nor anguish. What happened was a “mere fact,”
something merely objective, “outside there” (p. 183). The second case
deals with Professor Mora’s maternal grandmother. Here the person who
died was not a stranger but on the contrary was somebody whom Professor
Mora had known exceedingly well and with whom he had formed “a community
of participation” somewhat along the lines described by Landsberg.11
If the death of a given person is, in relation to a survivor, a “purely external
event” then, Professor Mora believes, the survivor would not be justified in
claiming that he had experienced the person’s death “in the sense of somehow
sharing it.” The death of his grandmother, however, was not experienced
by Professor Mora as a merely external event. In such a case “we
are not merely ‘watching’ someone die but we are, or are also, ‘sharing’ his
death—at least to the degree in which we had ‘shared things in common’.”
However, we must not allow ourselves to be carried away and claim too
much. Even when the death is not a merely external event, one only “somehow”
shares the deceased’s death—“to conclude . . . that we are actually
‘sharing’ another’s death,” even when the person was terribly close to us,
“would be to go too far” (p. 179, my italics). When all is said and done,
Professor Mora concludes, “I knew little about the relation between my
grandmother and her death, and still less about the relation between the man
shot down in battle and his death” (p. 185, Mora’s italics).

DEATH IS NOT A STATE

What is a person who has preserved his sanity to say to all this, more
especially to the search for the ontological character of death, the “inside
nature” of death as it is to the dead, the nature of the loss sustained not by
the survivors but by the deceased, death not as it is observable when we see
a dead body, but as it is “undergone” by the dead person? Perhaps the best
way to call attention to the ludicrous confusion underlying all such ontological
searches is to relate the following conversation between two German
pessimists.12 “It is much better to be dead than to be alive,” said the first.
“You are right,” remarked the second, “but it is still better not to have been
born in the first place.” “That,” replied the first, “is very true, but alas how
few are those who achieve such a happy state.” Since he regards death as a
loss and not as a gain we may, in this context at least, regard Professor
Macquarrie as an optimist and we may imagine an optimist who shares his
ontological views reasoning in the following way: “A man who loses both his
arms sustains a greater loss than one who loses one arm only, and a man
who loses his eyes and his arms sustains a still greater loss. A yet greater loss
is sustained by him who loses his life. Even he, however, is not quite as badly
off as the man who failed to be born in the first place. The lot of the latter
is the worst of all. It is very fortunate that there are not too many who find
themselves in this dreadful condition.” To diagnose as clearly as possible the
absurdity in the procedures of the German pessimists as well as the
Macquarrian optimist let us first, following Benn and Peters,13 distinguish
between the “actions” a person performs and the “passions” he experiences or
undergoes. An action is anything a person does, for example, singing a
song, giving a lecture, assaulting an enemy, resigning a position. A “passion,”
in the broad sense in which Benn and Peters use the word, is anything that
happens to a man—a toothache, the tortures he endures, the pleasure he
experiences when drinking a glass of orange juice after a game of tennis,
the feelings of constrictions he has when gagged or confined to a prison cell.
No doubt the distinction is far from sharp, but it is one which all of us make
in certain situations. Now, Macquarrie, Tillich, Mora, and most of the poets
and philosophers mentioned in the opening section of this article recognize
that the death of an individual is not an action, but they mistakenly believe
or imply that it is some kind of passion, though a very special and extremely
passive type of passion. In fact, however, neither death nor our nonexistence
before we were born are passions any more than they are actions. If we
introduce the word “state” to mean any action or passion, then we can
express our point by saying that, while feeling the coolness of the night,
reaching the last stop of a journey, arriving in a strange country from which
one will never return, sleep, and rest, sustaining losses (no matter how
serious), undergoing pain and torture, feeling isolated and all alone and
even finding oneself surrounded by impenetrable darkness are states or
experiences of living human beings, death is not a state. At times, indeed,
the ontological explorers themselves realize this, for example when they
complain about the difficulties of a phenomenological investigation of death.
At other times, however, they seriously believe that death is a state, a dark
and wholly or largely inaccessible one, to be sure, but a state nevertheless.
Without such an assumption they would have to admit that death is simply
the absence of life and there would be nothing to explore. It should be
added that these strictures do not apply to those who, when asking such
questions as “What is death?” or “What are we like after death?” thereby
raise the issue of survival. However, the existential ontologists whose
explorations we are discussing either do not believe in survival or else
explicitly stress that their ontological questions are not questions about
whether we survive the death of our bodies.
The linguistic form of the sentences which we use to assert that a person
is dead is similar to that of the sentences which are used to ascribe states to
individuals. This similarity makes it tempting to suppose that the former
sentences are also used to make state-ascriptions, but a little reflection is
sufficient to show that the kind of analysis which will work for
state-ascriptions does not make any sense in the case of statements asserting that
somebody is dead. Let us briefly look at the following three statements:

(1) A is performing in “Don Giovanni” at the moment.

(2) Tomorrow A will be in one of his gloomy moods.

(3) A year from now A will be dead.

If we go by linguistic appearances alone, we are inclined to say that (3) no
less than (1) and (2) are about A, and it is also tempting to believe that in
each case we are attributing or ascribing a certain state or experience to A—
in (1) an active state, in (2) one that is fairly passive, and in (3) an
extremely passive one. In a sense no doubt all three statements are about A—
in the sense that we are asserting some fact about A rather than about other
people—B, C, etc. In another sense, however, (1) and (2) are about A
while (3) is not. In (1) and (2) we are ascribing states to A, and we
presuppose that A is or will be alive at the times in question. In (3) on the
other hand we are not ascribing an extremely passive state to A: We are
denying what is presupposed in all state-ascriptions. (1) can be expanded
into “A is alive and is performing in Don Giovanni now”; (2) into “A will
be alive tomorrow and will be in a gloomy state”; but (3) cannot be expanded
into “A will be alive one year from now but he will then be in the
extremely passive state of deadness.” Yet those engaged in the ontological
quest treat (3) as if this were the proper analysis.

THE MADNESS OF THE ONTOLOGICAL QUEST

Once death is treated as a state, it is very natural to reach Tillich’s
conclusion that it is something absolutely unknowable. It is then quite natural
to reason along the following lines: I am now alive; I am not yet dead,
hence I cannot now know from personal experience what the state of being
dead is like. But this state is different from other unknowns. It is a very
special unknown. Africa is also unknown to me, but others who have been
there can tell me about it when they return. Again, I have never been
skating, but other people can tell me what it feels like to glide across a frozen
lake. Nobody, on the other hand, can tell me what death is like. For one
thing, nobody can come back from the dead to tell me; but, furthermore,
even if somebody did come back, this would not help, since while he was
dead he would have had no experiences and could not attend to his own
state of deadness. The conclusion thus seems inescapable that, as Tillich
so happily put it, death “is the unknown which by its very nature cannot be
known.”
In arguing that death is a totally unknowable state, Tillich dimly perceived
something which ontologists like Macquarrie and Mora obscure when they
assert that they have some little knowledge of the nature of death. Tillich
dimly perceived that it is logically impossible to attain the object of the
ontological quest. The ontologists write in such a way as to suggest that
they are trying to determine the characteristics of a peculiarly elusive state,
but a little reflection makes it clear beyond any question that what we have
here is a series of self-contradictory expressions and not any kind of state,
elusive or otherwise. To an uncritical reader it may appear—and the remarks
of writers like Macquarrie and Mora are specially apt to foster this
impression—that the object of the ontological quest is a state which cannot
in fact be examined by human beings because the only subjects competent
to examine it are chronically absent when they are needed for the examination.
It may thus be thought that the relation of human beings to the object
of the ontological search is like their relation to some territory which is so
extremely hot or so extremely cold that anybody wishing to explore it is
annihilated before he can get to his destination. In fact, however, the
situation is altogether different. The ontologists are wondering what death would
feel like to the dead if they could attend to their deadness, but part of what
is meant by saying that a person is dead is that he no longer has feelings or
experiences. The ontological search thus amounts to the question “How does
it feel to be in a state in which one no longer has any feelings?” or “What
kind of an experience does a person have who no longer has any experiences?”
These questions are not one whit more sensible than such absurd
questions as “How long is the fourth side of that triangle?” asked by
somebody who is pointing to a perfectly ordinary triangle or “In which country
is the father of this orphan living now?” where the questioner is not referring
to any foster father or to any habitat in the next world. The ontological
questions do not become any less grotesque by being expressed hypothetically.
“How would Hume’s death feel to him if he could attend to it?” is
not any less ludicrously absurd than “How does Hume’s death appear to
him?” or “How does (the dead) Hume feel about his death?” Once again
the reader should be reminded that the ontological explorers have ruled out
questions about survival as irrelevant to their problem.
In a recent decision in which he enjoined the American Nazi Party from
holding parades within two miles of Jewish houses of worship, a Chicago
judge observed that he would similarly issue an injunction against a group of
nudists if they wished to parade in their native attire outside a Presbyterian
church. Puzzled by the nature of native attires, an ontologist might now
engage in the following investigation: To wear one’s native attire is to wear
very peculiar clothes. What kind of clothes is a person wearing who is wearing
his native attire? There are serious difficulties in the employment of the
phenomenological method in this case. When a man is wearing a hat or
a woman wearing a skirt and blouse we can perceive the clothes they are
wearing. However, when we look at somebody who is wearing his native
attire we cannot perceive any clothes. If we could perceive clothes on the
person, he would not be wearing his native attire. At this stage a Tillichian
ontologist would maintain that we must reach an agnostic conclusion—
native attires consist of unknowable clothes—while somebody following
Macquarrie and Mora would try to attain a “little knowledge” perhaps by a
careful study of the clothes which the nudists wear when they are not wearing
their native attire or by studying people who are in the process of
changing from their work clothes into their native attire. Perhaps analogies
might yield helpful clues—perhaps we should study oranges and apples and
bananas after they have been peeled or perhaps an examination of trees,
denuded of their foliage, may yield at least a preliminary understanding. The
ontological investigation of the nature of death is just as ludicrous as the
ontological inquiry into the nature of native attires. To every move in the
latter investigation there corresponds a move occurring in the writings of
the ontological explorers of death.
There is a familiar story about the boy who, before his first date, was
advised by his father to discuss three subjects—love, family, and philosophy.
Following his father’s advice and taking up love, he first asked the girl, “Do
you love noodles?” to which the answer was “no.” Remembering that he
should next discuss the topic of family, he asked his date whether she had a
brother. The answer again was “no.” This left only the subject of philosophy
and the boy now asked his final question: “If you had a brother, would he
love noodles?” I think it would be generally agreed that this last question is
absurd, but the ontological question about death is considerably more absurd.
Although it would in almost any normal circumstances be utterly
pointless to inquire whether a hypothetical brother loves noodles, the question
is not self-contradictory: We can describe what it would be like for a
girl who in fact has no brother to have a brother and what it would be like
for such a person to love or not to love noodles. We might even possess
some evidence supporting the claim that a given person’s brother would (or
would not) love noodles. The ontological question about death, on the
other hand, is self-contradictory. The question “How would Hume feel
about his death if he could attend to it?” is not merely pointless, but the
very meaning of the constituent terms makes it logically impossible to obtain
an answer. Hume (or anybody else) cannot both be dead and attend to his
deadness: If he is dead, then he can do no attending of any kind; if he can
attend to anything, he is not dead and hence cannot attend to his deadness.
While we can describe what it would be like for a girl to have a brother who
would (or would not) love noodles, we cannot describe what it would be
like for Hume (or anybody) to experience his deadness.
Miss Arleen Beberman, an existentialist explorer from New Haven, Connecticut,
finds the ontological quest beyond her capacities. “If we think or
imagine what it would be like to be dead,” she remarks, “we surreptitiously
introduce scenes of life and living people”14 and thus fail to reach our
objective. On the other hand, “if we do experience death,” we cannot “report
back from the encounter” (op. cit., pp. 18 and 22) and hence our efforts are
once again defeated. Miss Beberman decides that she will not aim at any¬
thing so ambitious as a “phenomenology of death.” “Such a goal,” she
writes, “is beyond my present intent since the method of coming to that goal
requires utmost rigor, boundless creativity, and plenty of time. I lay claim
to none of these” (pp. 18-19). In view of her limitations Miss Beberman
concludes that her efforts will be merely “episodically phenomenological.”
Modesty is a most becoming human trait, but here it is out of place. In the
present context even the most “creative” phenomenologist, with limitless
time on his hands, could not do any better just as a person with perfect
vision could not ever detect the clothes which make up a native attire and
just as an observer with the most sensitive and highly developed sense of
hearing could not discover the language in which somebody is silent.
Death is not a state and once this is clearly seen there is no temptation
to engage in an ontological quest and equally no temptation to regard death
as unknowable. Death is the absence of life and consciousness; and while
in this or that instance it may of course be unknown whether a certain man
is really dead (e.g., whether a Nazi leader was killed during the last days
of the war or whether he is hiding in South America after undergoing plastic
surgery), this is not something that is in principle undiscoverable. Nor do
people, in spite of the general tendency to think of the dead as continuing in
a dark abode, have in practice the slightest difficulty understanding what is
meant by the assertion that somebody is dead. They understand such statements
just as readily as they understand statements asserting that a certain
person was not yet born at a certain time or that somebody failed to show up
at a certain place or that he was silent or that a certain individual wore no
clothes.
Something should perhaps be said at this stage about the widespread
belief that death is unthinkable and unimaginable. In his discussion of
“Grenzsituationen,” Jaspers remarks:
PAUL EDWARDS 487

Death is something unimaginable, really something unthinkable. What we
imagine and think of in this connection are merely negations, merely associated
phenomena (“Nebenerscheinungen”) and never positivities.15

Jaspers is surely right in maintaining that when one thinks of such associated
phenomena as funerals or the mourning of the bereaved survivors, one is not
thinking of death itself, i.e., of the death of the person who died. However,
if life and consciousness are in the present context taken to be “positivities,”
then, in thinking of death, one would have to think of a “negativity.” If
thinking of President Kennedy’s life is thinking of a positivity, then thinking
of his death is thinking of the termination of his life—of the absence, the
nonoccurrence ever again of any actions or passions that would be part of
his biography. But this is apparently not enough for Jaspers and others who
are under the impression that death is a state. They presumably require that
a person, in order to think of death, should be thinking of a dark presence
and not merely of the termination of life and consciousness; and, since unfortunately
there is no such presence (or else it is impenetrably dark), one
will conclude that death itself, as distinct from side-phenomena and negativities,
is altogether unthinkable.16

THE PSEUDOEMPIRICAL PROCEDURES OF THE ONTOLOGICAL EXPLORERS

The full ludicrousness of the ontological search is hidden from the explorers
(and presumably also from their less critical readers) by the employment
of certain highly misleading strategies. The first of these to which
attention should here be called is the frequent use of quasi-inductive techniques
and language and the related claim, made by some ontologists, that
although the nature of death must remain largely unknown, a certain amount
of understanding has in fact been achieved by their methods. These strategies
suggest that the ontologists are engaged in a quest that is not in principle
different from the investigations of a scientist, however much more difficult
it may be because of the peculiar nature of the subject matter.
In this connection it is worthwhile to engage in a rather full examination
of the ontological “investigations” carried out by Professor Mora. It will be
recalled that, according to Professor Mora, we can “somehow” get on the
inside of death by studying the various attitudes that people display toward
death and that we can gain a little knowledge of what death is in those cases
in which there had been a community between us and the dead. Professor
Mora believes that in the latter kind of case the survivor, to some extent,
shares the dead person’s death and that as a consequence he obtains a little
knowledge of what this death is like on the inside. If a survivor and a dead
person did not form a community, the death in question is merely an “external”
event, but where there was a community, the death becomes more
than a merely external event. It seems clear that Professor Mora is misled
here by the pictures associated with the words “external” and “internal.”
No matter how much a person may be shaken by a given death, he cannot
get at it from the inside any more than a survivor who is altogether indifferent.
He does not get inside the dead person’s death—not even a tiny bit—
not because he lacks some special gift of empathy which other human beings
possess or might conceivably possess but because there is nothing to get into:
There is nothing to get into since death is not a state or condition “of the
deceased” and since, if it were a state, it would not be one to which anybody,
the dead person or any survivor, could conceivably attend.
The word “share” is commonly used in a number of different senses.
For example, we say that two people share a certain object, like a house
or a car or a restaurant, if both of them legally own it. Again, we say
that people share the same outlook or convictions—e.g., when both of them
are socialists or absolute idealists or admirers of Heidegger. Here what we
mean is that the two people have similar views or similar attitudes. When
Professor Mora claims that on certain occasions one human being can (to
some extent or somehow) share the death of another, he evidently has
neither of these ordinary senses in mind. In all likelihood he is thinking of
the sense in which we say of a person that he shares the grief or the suffering
of somebody else if he is so sympathetic that, upon observing the
other person’s grief or suffering, he experiences a kind of duplication of
these in himself. In general, when we use the word “share” in this last
sense, we mean more than that the two people have similar feelings: We
mean that the first person is so attached to the second that the feelings of
the second immediately lead to similar feelings in him. A little reflection
makes it quite clear that “share” can no longer be intelligibly used in this
sense when a survivor is said to share the death of somebody else. For no
matter what the survivor feels, he is not reproducing death in himself. One
cannot be significantly said to “share” in this sense unless there is something
to share—something like grief or pain—and death does not qualify as
such a something. Of course, a person may in this sense share somebody
else’s dying—he may experience in himself the anguish or the serenity or
whatever emotions the dying person feels; but this is totally beside the
point since what the ontologist is out to explore is death and not dying.
A study of the attitudes of people toward death, whatever its intrinsic
interest, does not help the ontological quest along any more than a consideration
of bereavements which are classified as more than merely external
events. In both cases Professor Mora seems to think that the
psychological data available to us (in one case the feeling of the survivor,
in the other the attitudes of the people who are thinking about their death)
are related to death itself somewhat like the reflections or images of an
object (in a lake or a mirror or on a photographic plate) are related to the
objects whose reflections they are. However, it is not and cannot be so.
Any opinion to the contrary is bound to be the product of confusion.
Mora’s main confusion consists in an amalgamation of two questions which
are logically quite distinct. The first is the psychological question “How do
people face death?” The second is the ontological question “What is death
like from the inside?” or “What is deadness as it is to those who died?”
Mora manages to confound these questions by an ambiguous use of the
phrase “experience of death.” Neither of Mora’s uses can be regarded as an
ordinary sense,17 but it is easy to track down the ambiguity involved. In
one context Mora refers to that experience, if such a thing were possible,
which the dead person would have if he attended to his deadness. In the
other sense he simply refers to the feelings and attitudes of people who
contemplate their impending death. Let us call the former the “ontological”
and the latter the “attitudinal” sense. Professor Mora himself in one place
realizes that he is using the word “experience” in this ambiguous fashion
when he concedes that “no doubt an ‘attitude’ is not exactly the same as an
experience” (BD, pp. 193-194). This does not, however, prevent him
from proceeding as if no such ambiguity existed. He argues that since
people do experience death in the attitudinal sense they therefore have some
little experience of death in the ontological sense as well. But this is a
gross non sequitur and a most confusing amalgamation of two issues. To
the question “Do people have experience of their death in the attitudinal
sense?” the answer is clearly a ringing “yes,” while to the question “Do
people have experience of their death in the ontological sense?” the answer
is an equally ringing “no.” Professor Mora apparently thinks that by
amalgamating the two questions we can reach a happy compromise and
answer the question (suggesting that this is still just the original ontological
problem) with a hesitating and soft-spoken “yes.” We do not, using his
favorite image, ever obtain a full inside knowledge, but equally the knowledge
we have is not “wholly from the outside.” Using this language, we
may express the real situation by saying that if by “the nature of death”
one is referring to nothing more than the ways in which people feel and
think about their death, then human beings have a very good knowledge
of death from the inside, while if by “nature of death” is intended what
the ontologists originally set out to explore, then we do not have even a
tiny bit of inside knowledge—or at any rate this in no way follows from
our inside knowledge of death in the other sense. Mora insists that when a
person thinks about his attitude he does in a sense stand “in front” of it.
This is not an unnatural way of speaking, but the “in-frontness” here
involved is not the in-frontness required by the ontologist. The in-frontness
required by the ontologist is the kind which occurs when human beings
look at a mountain or when they attend to their own feelings. In this
sense, when a person attends to his attitude toward death, he is “in front”
not of death but of his own feelings and thoughts about death.
Professor Mora maintains that he possesses some “little” knowledge
concerning the relation between his grandmother and her death, but he
nowhere tells us what this little knowledge consists of. This is not surprising
for the simple reason that Professor Mora has no such knowledge and can
have none. If anybody thinks otherwise this can only be due to the failure
to recognize an ambiguity similar to the one described in the preceding
paragraph. It should be noted that Mora does not adduce the fact that he
could not achieve more than a little knowledge about the relation between
his grandmother and her death as peculiar and exceptional. In other words,
it is not just Professor Mora who possesses no more than a little knowledge
in such a case, but all human beings are similarly handicapped and inevitably
so, no matter how well they may have known the deceased, no matter
how close they may have been to him or her throughout life and throughout
the last days. Human beings do not even, in their own cases, have any
greater access to the relation in question. The reason for this is not, as
Professor Mora’s language suggests, some kind of empirical limitation like
that of a thief who cannot get into an apartment he wishes to rob because
he finds it impossible to break through the lock. The reason is the senselessness
of the expression “X’s relation to his death” as this is used by the
ontologist in the course of his quest. This senselessness is obscured by the
fact that the expression “X’s relation to his death” also has a rather clear
meaning in other contexts. In nonontological contexts the question “What
is X’s relation to his death?” would be naturally interpreted to be a
means of asking for information about X’s attitude toward his death. Here,
while we may in this or that case be very ignorant about the person’s attitude,
we can frequently have a great deal if not indeed complete knowledge;
and certainly the person himself very often has more than merely a
little knowledge. In this sense it seems to me that Professor Mora, having
known his grandmother very well and having spent much time with her
while she was dying, probably had more than a little knowledge of her
relation to her death. Or if he did not, the ignorance is not something that is
universal and inescapable. However, none of this is of any aid to the ontologist.
For what the ontologist is concerned with is not how people feel about
their death while they are alive but what death is, i.e., what it is to the dead,
what the loss is that the deceased has sustained—not how the deceased felt
prior to sustaining the loss. And if the question is taken in this ontological
way, Professor Mora has not little but no knowledge whatsoever of the
relation between his grandmother and her death. One would know somebody
else’s relation to his death when the question is asked in the spirit of
the ontologist only if one were that other dead person and could then attend
to that person’s deadness. This, however, is a logical impossibility even
if it were not a logical impossibility to be somebody else. It is, as we pointed
out in the last section, a logical impossibility because if one is dead one
cannot do any attending. Not only can Professor Mora have no knowledge
about the relation between his grandmother and her death, but in the
ontological sense even his grandmother herself can have no such knowledge.
It is important to realize that Professor Mora’s empirical arguments do
not “happen” to be invalid, but that they are bound to fail. What is most
objectionable about them is not their detailed defects but their very production
in the spirit that empirical arguments of some kind might conceivably
provide clues to the nature of the object of the ontological search. Much
the same applies to “analogies” like that between the ripeness of a fruit
on the one hand and death on the other which Macquarrie (and Heidegger)
reject but whose very consideration suggests that we have here an inquiry
that might conceivably be carried on by means of analogical “indications.”
Macquarrie and Heidegger are right in rejecting such comparisons, but
they give the wrong reasons. In the case of the analogy between the fruit
and death, Macquarrie and Heidegger complain that the analogy breaks
down because the ripeness of the fruit is “the fulfillment of the fruit,” but
the end for a man may come when he is still immature or long after he has
passed the peak of his powers. Let us suppose, however, that all human
beings were to die precisely at the moment of their greatest fulfillment,
neither too young nor too old, i.e., when their powers are at their peak.
Let us suppose for example that Mozart had not died at the age of thirty-five
but that he had lived on until he was sixty-five when his powers
finally began to decline, and that Winston Churchill had not lived on into
a state of near-senility but that he had died shortly after the successful
conclusion of the war against the Nazis. Even if this sort of thing happened
universally, the analogy would break down for the simple reason that the
fulfillment or maturity of the fruit is a state of the fruit while the death of a
man is not one of his states—mature, immature, or any other kind. It is
conceivable that a certain kind of person, like Bardone in Rossellini’s
Il Generale Della Rovere, would experience the greatest moments of fulfillment
in the course of sacrificing himself for somebody else or for a
cause; but these experiences would still be states of his living organism.
In such a case one may, using language loosely, say that the person’s
death was his greatest fulfillment. However, if one is talking sense, one is
really referring to what the person did or experienced while he was dying
or going to his death. It is important to bring out the proper reasons for
dismissing the above and other analogies since the reason given by Macquarrie
and Heidegger suggests that if only human lives were different in
certain ways some of these analogies would work. Analogies in which death
is compared with a state cannot work, but of course there is not the
slightest need to introduce any of them in order to discover what death is.
As already pointed out we know quite well what death is and we no more
need “analogical clues” in the present case than we need them in order to
understand the nature of silence or of native attires.
THE SHIFT FROM THE ONTOLOGICAL PROBLEM TO OTHER QUESTIONS

In both of his discussions of the ontology of death, Professor Macquarrie
reaches a stage at which he claims to have achieved “a preliminary understanding”
of death. This “preliminary understanding” consists in the conclusion
that death “is man’s untransferable possibility of being no longer
in the world.” I now wish to call attention to the following features of his
procedure: first, whatever one may think of the assertions to which Professor
Macquarrie refers as “preliminary understanding”—whether they are
meaningful or not, true or not, important or not—they do not constitute
any kind of relevant answer to his original question. They are in this sense
not even a “preliminary” understanding. They tell us nothing about the
ontological character of death—the nature of the loss sustained by the
deceased, the nature of death as undergone by the dead. The statement that
a person cannot transfer his death to somebody else no more tells us anything
about the content of death than the statement that one human being cannot
transfer his native attire to another tells us what native attires are. Or, to
use a different illustration, in pointing out that nobody can eat or digest my
food for me or that nobody can do my sleeping or resting for me, one does
not explain what eating, digesting, sleeping, or resting consist in. Secondly,
in the remainder of his discussions Professor Macquarrie confines himself
exclusively to nonontological issues—chiefly to the psychological questions
“How do people in fact think and feel about death?” and “Do they
face it honestly or do they try to evade it and, if so, how?” and to what
we may call “moral” or “practical” questions like “How ought a person to
act in view of his inevitable death?” Practically the entire discussion in
both books after reaching the “preliminary understanding” is devoted to an
advocacy of the “authentic” attitude toward death (in which one “resolutely
anticipates” one’s “capital possibility” and even finds “joy in this mode of
life,” SCE, p. 55) and to an analysis and condemnation of the inauthentic
approach of those who are in a “fallen” state and who “cover up” for themselves
the present possibility of death” (ibid.). Professor Macquarrie
began by telling us that he is out to discover “what the undergoing of
death may be like,” what is death for the person who has been robbed of
his being, and he rightly points out that the phenomenological method
encounters difficulties here. Before long these difficulties are overcome by
investigating not death but our present feelings about death. This transition
is effected with the greatest ease. “Existence,” we are told, “is dying, and
death is present to us and, in a way, accessible to us” (op. cit., p. 52, my
italics). And again “death is, . . . in a sense, already in the present. It is
already accessible, as thrown possibility, to the investigation of the existential
analytic” (op. cit., p. 55, my italics). Any consistent ontologist ought surely
to protest that Professor Macquarrie simply abandons ontology for introspective
psychology here. What is “present” and “accessible” is not death
but thoughts about death and no amount of qualifications (“in a way,” “in
a sense,” and many more I have not reproduced) can undo the difference.
What becomes accessible to phenomenological study or to the existential
analytic had always been accessible, and what had not been accessible (the
state of deadness as distinct from thoughts about death) is no more accessible
after the shift than it had been before. Macquarrie first ruled out the
study of the death of others on the ground that although it “might teach us
much . . . about psychological reactions in the face of death,” it “can never
disclose death as an existential phenomenon,” but he ends up studying
precisely such psychological reactions.
It should be emphasized that Professor Macquarrie is by no means
alone in shifting from ontological to psychological and practical issues, and
it should also be noted that these psychological and practical questions
are not usually senseless. The failure to detect the shift and the intelligibility
of the questions to which the ontologists transfer their attention are perhaps
as much responsible for their not perceiving the ludicrousness of the
initial ontological quest as the use of such quasi-inductive techniques as we
described in the last section. In Macquarrie’s case the main mechanism of
the shift is an ambiguity in the word “existential” as it is used in such
expressions as “existential character” or “existential phenomenon.” All
existentialists are agreed that death is more than a biological phenomenon;
and to this “more” they refer as the “existential character” or the “existential
aspect” of death. However, different existentialists and sometimes
the same existentialists at different times have different things in mind when
they speak of this “more.” Sometimes when we are told that the existential
character of death must escape the biologist, what is meant is indeed the
“ontological character” of death—the object of what we have been
calling the ontological quest; but at other times, when the limitations of
the public methods of biology are stressed, the writers refer to the inner18
feelings of anguish, horror, serenity, or whatever people experience when
they think about their death. Since the word “existential” is used in both
of these ways, Professor Macquarrie can maintain that all his answers are
answers to questions about the existential character of death and anybody
who is not attentive to the ambiguity just described would not notice the
shift that has taken place.
In Mora’s case the mechanism of the main shift is an ambiguity in the
word “understand.” Professor Mora sets out to “understand” death and
originally this means finding out what death is on the inside in the ontological
sense. But he is also concerned with the question “Does death ever
(and perhaps always) have meaning or is it always or at least sometimes
an absurd happening?” To this latter question Professor Mora proposes
the answer that “human death is never completely meaningful, nor is it
entirely meaningless—it is meaningful and meaningless in varying degrees”
(BD, p. 186). I do not profess to understand either what he means by this
question or by his answer, but this does not affect the possibility of tracking
down his shift. It seems quite clear that when we say about something, x,
that we know what it means, it is permissible to express this by saying that
we understand x, whether x is a word, a phenomenon, or a theory. It is
thus quite natural for Professor Mora to believe that he has answered
his original ontological question after concluding that death is always
meaningful in varying degrees and that this meaning can be ascertained.
Whatever the merits of these last contentions may be, they do not constitute
any kind of answer to his original question—they do not make
death “comprehensible” or “understood” in the sense in which these words
must be used when they express the ontological problem.

THE CLAIM THAT DEATH IS MORE THAN A “NATURAL” PHENOMENON

All existentialists agree that death is more than a natural phenomenon
and that some nonscientific technique (variously called the “phenomenological
method” or the “existential analytic”) is required for the study of
its nonnatural aspects. This conviction is shared by existentialists who
actively pursue the ontological quest and by those who only occasionally
show some slight inclinations in that direction without ever setting out on a
full-fledged expedition. Professor John Wild, who belongs to the latter
group, offers the following considerations in support of the view that death
is not merely a natural phenomenon:

The existentialist thinkers have performed an important service in recalling
our attention to the actual phenomenon of personal death. They have shown
with great cogency and clarity that this is something more than the objective
biological stoppage which can be observed from the outside. The limited
methods of science can shed no light on this inner existential phenomenon
which is open only to philosophical description and analysis.19
The existentialist contributions to the phenomenology of death are also of
major importance. They have certainly shown the incapacity of naturalistic
and pan-objectivistic interpretations to account for the more important existential
phases of this mysterious and long-neglected phenomenon. In this sense,
death is not something universal. It concerns me as an individual. It is not a
replaceable, interchangeable function, but something I must face by myself
alone. It is not an event that I will observe in the future, but something that
I must either evade or face authentically here and now.20
There is a great deal that is objectionable in all this. With the claims
that death is a “mysterious” phenomenon and that I must face death “by
myself alone” I shall deal in later sections. Right now, however, it is necessary
to observe that although some of Professor Wild’s remarks are true,
they do not in any way imply his main conclusion about the existence of
aspects of death which cannot be studied by the “limited methods of
science.” To begin with, Professor Wild is quite right in calling attention to
the difference between what one may call the statistical and the personal
perspectives. It surely cannot be denied that a person’s state of mind is very
different when he gives his assent to the proposition that all men are mortal
from what it is when he realizes that he himself is one of those who will
inevitably die. Although this is certainly not something that existentialists
have discovered, people do perhaps on occasions forget it and it may well
be salutary to be reminded of it from time to time. To this, however, it
must be added that the differences are not peculiar to the subject of death.
It is exactly the same with thousands of other things—e.g., suffering imprisonment
unjustly or contracting a chronic and painful disease. The state
of mind of a person who reads in a book that 10 percent of all people
condemned to prison sentences are in fact innocent is very likely to be
significantly different from what it would be if he became one of those
convicted for a crime he did not commit; and the state of mind of somebody
who reads about what patients suffering from chronic arthritis go through
is likely to be very different from what it would be if he himself became
such a sufferer. Convictions, just and unjust, are phenomena that can be
studied by the methods of science and so can the feelings of those convicted;
and the same is true of arthritis and the states of mind of those suffering
from this disease. It is not easy to see why the admission that there is a
genuine difference between the personal and the statistical viewpoints
should imply that either the subject in question (be it arthritis, convictions,
or death) or the mental states which make up the personal viewpoint fall
outside the scope of scientific inquiry.
Professor Wild, like other existentialists, has a tendency to define “science”
in a misleadingly narrow way. It may be granted that death is “something
more than the objective biological stoppage which can be observed from
the outside.” Death is also the termination of consciousness. However,
for this insight it is not necessary to appeal to phenomenology or to the
existential analytic. Professor Wild no doubt in this context also thinks of
dying; and again it may be granted that the biologist, in studying the
physiological processes that go on in a dying organism, does not thereby
study the experiences of the individual which, from the human point of
view, are usually the most poignant aspect of the situation. Again, however,
from this it does not follow that science cannot study these inner experiences;
and in fact Feifel and other contemporary psychologists21 have
amassed a good deal of interesting material which is not one whit less
scientific than the work of other psychologists who rely on the introspective
reports of their subjects. Some of Heidegger’s own most interesting comments
about human attitudes toward death, which carry the wholehearted
endorsement of Professors Macquarrie and Wild, would, if true, be part of
this branch of scientific psychology.
Somebody might admit all of this but maintain that science cannot
tackle the practical and moral issues about death—how human beings
ought to face it and conduct their lives in the light of their inevitable doom.
This may be admitted although it is an exaggeration to say scientific information
can never have any bearing on such moral questions. It may be
conceded that science cannot answer these questions without however
conceding that there are some nonnatural features of death which require
investigation by the “existential analytic.” There are many other practical
questions which also cannot be answered (simply) by using scientific techniques.
If a man is contemplating marriage and the question before him is
which of two women he should choose or, to take a less momentous example,
if a person asks himself which tie he should wear with his new blue
shirt, science too does not provide the answers. However, it does not follow
from any of these admissions that the choice of a wife or of a tie are
phenomena with aspects that can be investigated only by some nonscientific
technique.

DYING ISOLATED AND ALONE

One of the most pervasive confusions in the writings of the existentialists
is their failure to distinguish between death and dying.22 The existentialists
themselves on occasions endorse this distinction in a general way. Thus
both Macquarrie and Mora quote, with apparent approval, Wittgenstein’s
dictum that “death is not an event of life”23—at any rate if they think that
Wittgenstein was wrong they nowhere give us their reasons. Such admissions
in general terms do not, however, prevent these writers from constantly
confounding death and dying in discussions of specific topics. Unlike death,
dying is a process or, in our use of the word, a state or succession of states,
and many of the existentialist pronouncements cease to be senseless when
they are interpreted as statements about dying. Thus it is not nonsense to
maintain that a person finds his greatest fulfillment in dying, although
cases of this sort are certainly very rare. It is not nonsense, but frequently
true, that a person while dying is undergoing a great deal of suffering.
Again, if we have in mind the anguish or the other feelings experienced by
a person who knows that he is dying, then it is conceivable that others may
share his dying in the same sense in which one sympathetic human being
may share the emotions of other human beings. Or, to take one of Tillich’s
favorite statements, it is indeed absurd to maintain that a person finds himself,
after death, engulfed by an impenetrable darkness, but similar remarks
about dying are not only not absurd but may well be true. Thus a German
physician, Johannes Lange, who studied patients dying very gradually of
PAUL EDWARDS 497

degenerative diseases, reports that as their life was slowly ebbing away, they
felt that they became more and more surrounded by darkness.24 In our
ordinary thinking we also frequently fail to keep death and dying clearly
apart, and this is one reason why the full ludicrousness of the ontological
quest is not always noticed. When the existentialists mean death (and I
am here referring to situations in which they must mean death if they are
to do ontology), many an innocent reader tacitly substitutes “dying” and
the resulting statements, though frequently false, are no longer senseless.25
The confusion between death and dying is unquestionably one of the
factors responsible for the extremely misleading assertions, endlessly repeated
by all existentialists, that all of us must die isolated and alone. “No
one can die my death for me,” writes Professor Wild, “this thing at least I
must do alone” (op. cit., p. 82, my italics). Death, he later remarks, “is
an actual act to be lived through by an individual alone” (p. 83, my italics).
Again, in a passage quoted previously, we are told that death is “something
I must face by myself alone” (p. 239, my italics). Professor Macquarrie
expresses this doctrine of what we may call “the privacy of death” by
declaring that death “isolates the individual. He must die himself alone”
(ET, p. 118, my italics). In a similar vein, though not using the word
“alone,” Professor William Barrett writes: “Death is not a public fact occurring
out there in the world; it is something that happens within my own
existence.”26 When these writers maintain that human beings die isolated
and alone (and this of course is asserted not of some but of all human
beings), they presumably wish to claim more than merely that all human
beings eventually die. They give the impression and they themselves undoubtedly
believe that they are not merely redescribing the latter familiar
fact. However, if “alone” is used in any sense in which “all human beings
die alone” asserts more than that they all eventually die, it is quite clearly
false. In one natural sense of this expression, somebody dies alone if he
is physically isolated as in the case of a man who gets lost on an Arctic
expedition and freezes or starves to death before the rescuers arrive. There
is another sense in which one may quite naturally speak of somebody as
dying alone or in solitude, although the person need not be physically
isolated like the man lost in the Arctic. What we then mean is that, while
dying, the person is psychologically or emotionally isolated—he does not
greatly care about anybody else and nobody else cares much about him.
It is in this latter sense that Ivan Ilyich in Tolstoy’s moving story dies alone
although he has a dutiful wife and daughter. Now, it is clear beyond any
doubt that, while some people die alone in one and some in both of these
senses, others do not die alone in either sense. Winston Churchill, Louis
XIV, and David Hume, to cite some familiar cases, did not die alone in
either of these senses.
The existentialists suggest that there is a third sense in which “dying alone”
means more than just “dying” and in which all human beings necessarily die
alone. It is easy to show that there is no such further sense and that the
only sense in which it is true that all human beings necessarily die alone is
one in which “dying alone” is logically equivalent to “dying.” As the existentialists
use “alone” (or rather as it must be interpreted if their statement is
not to be plainly false), it is logically inconceivable for a person not to die
alone. Let us suppose that a human being is not dying either in physical or
emotional isolation but is, on the contrary, experiencing the greatest and
deepest love and happiness of his entire life during his last days.27 As the
existentialists use “alone” in the present context, such a person would still
have to be described as dying alone for the simple reason that he is dying.
If “alone” had some additional content, then it would be possible to describe
what it would be like for a person to die without dying alone; but as the
existentialists use these words, such a description is not possible. The existentialists
seem to be saying something novel and of interest here and they
also seem to be saying something that is plainly true. However, the upshot of
our discussion is that if their statement is interpreted in such a way that it
says something interesting, then it is clearly false; while if it is interpreted so
as to make it true, it becomes nothing more than a rhetorical way of asserting
the exceedingly familiar fact that everybody dies some day.
Much the same comment is applicable to the other formulation that is
commonly given to what we called the doctrine of the privacy of death. We
are told that nobody can get another human being to die for him, to act as
his substitute or representative in the matter of death; and this is put forward
as a statement asserting more than the familiar fact that everybody eventually
dies. However, the key expressions in these formulations are ambiguous:
When used in one sense, the statement to the effect that nobody can die
somebody else’s death is not platitudinous and goes beyond the assertion that
everybody eventually dies, but in this sense it is false; when used in another
sense the statement is true, but then it becomes a platitude simply reasserting
that everybody dies some day. There is a perfectly natural sense in which
people can get others to die as their substitutes. During the French Revolution
the authorities in Paris would occasionally allow a man who had been
sentenced to death to leave his prison in order to attend to urgent business
provided that somebody else took his place as a kind of human bail. If the
person sentenced to death did not return by a given time, the man substituting
for him would be guillotined in his place. Heidegger and his followers
point out that in such a case the evil day is postponed and not ultimately
avoided: The person who absconded will some day die and then he will
not be able to have somebody else die in his place. In Heidegger’s own
words: “No one can die for another. He may give his life for another, but
that does not in the slightest deliver the other from his own death.”28 To
keep these two senses apart, let us insert the adjective “ultimately” whenever
we use the expressions in the sense in which it is clearly true that nobody
can get somebody else to die as his substitute or representative. Now,
it seems clear that what prevents a person from ultimately getting a substitute
is simply the fact that he will eventually die. It is part of the meaning
of “he will die” that he cannot ultimately get a substitute. It is not one fact
that all human beings eventually die and a further fact that they cannot
ultimately get a substitute for their death: These are two different ways of
referring to the same fact. Suppose there were a tremendously powerful
tyrant who has for many years been in the habit of getting other people to
do the most varied things in his place. Whenever an unpleasant task comes
up—e.g., to meet a foreign dignitary, to attend the opening of a boring play,
or to receive an honorary degree—he sends somebody else. When he is
challenged to a duel, he sends a substitute, and when he wishes to get rid
of dangerous opponents, he sends other people to do the killing for him.
After many years of this, he comes to believe that he can also (in our second
sense, i.e., “ultimately”) get somebody else to die for him. How would we,
if this were possible, convince the tyrant that he was mistaken? In effect he
believes that he will never die and to show him that here he cannot ultimately
get a substitute we would have to convince him that, like all others, he will
eventually die: We do not have to convince him first that he will eventually
die and then, separately, that he cannot ultimately get a substitute in the
matter of dying.
To the above criticisms of the doctrine of the privacy of death, it may be
replied that although not everybody is necessarily alone while dying, in
death this is the fate of all human beings; and this contention will appear
plausible to all, whether they are ordinary people or philosophers engaged
in the ontological quest, who are under the influence of the notion that the
dead person is sleeping in the grave or that he somehow continues to exist in
a dark abode. The man who is lying in his coffin is there all alone and his
loneliness would not be remedied even if we put a few corpses or perhaps a
few living people into the same coffin since he would not be able to converse
with them. Somebody under the sway of this picture might exclaim that the
dead are in fact more alone than any living person can be. For, unlike the
living they cannot obtain any of the relief that comes from talking about
one’s losses. The dead cannot talk to the living or to their fellow dead or
even to themselves. Their loneliness is thus seen to be truly staggering! Once
this is spelled out in full, the absurdity becomes quite obvious; while it is
false to assert that everybody dies alone in any sense in which this asserts
more than that people eventually die, it is senseless to say about anybody
that he is alone in his death. To be alone, one has to be alive, and the dead
are neither alone nor not alone for the same reason that Julius Caesar is
neither an even nor an odd number and that feelings of anger are neither
blue nor red. As for Professor Barrett’s statement that death is not a “public
event,” it is appropriate to remark that unless death is taken as the inner
state of the deceased in which he has the experience of having no experiences
—and we saw that there were “difficulties” in conceiving death in this way
—death is a public event, though not one which the dead person can witness.
What is not or not exclusively a public event is dying, or more specifically
the experiences of the dying person; but these, it should be added, are no
more private than other feelings and thoughts.

THE “MYSTERY” OF DEATH

Almost as frequently as they assert that each person must die his death
alone, the existentialists make remarks to the effect that death is something
mysterious. They do not merely mean that this or that man’s death is a
mystery but that all deaths anywhere, at all times, and under all imaginable
circumstances are, and are necessarily, mysterious. Thus Professor Wild, in
a passage quoted previously, speaks of “this mysterious and long-neglected
phenomenon” (op. cit., p. 238) and earlier in the same work he observed
that “harsh, mysterious, and inexorable, death places all else in question
and reveals the uncanny strangeness of the world” (p. 84). In several places
Wild also asserts that death is “opaque to theoretical analysis” (p. 82) or
“opaque to understanding” (p. 81). Professor Tillich, needless to say, since
he regards death as the “end . . . with its impenetrable darkness,” concurs
in this opinion and adds that time too is a mystery.29 Professor Macquarrie
complains that people who treat death as an impersonal phenomenon are
thereby taking the “mystery and imminent threat” out of it (ET, p. 121).
The remarks about the mysterious nature of death just quoted are rather
cryptic, but there is a much fuller statement in a recent work by an English
writer who, while not calling himself an existentialist, expresses great sympathy
for the movement. In his Existentialism—For and Against, Paul
Roubiczek praises the existentialists for offering valuable correctives of
errors associated with positivism and the thinkers of the Enlightenment.
“The mystery of death,” he writes, “even more than that of birth, is bound
to invalidate all the false convictions which survive from the Age of Reason.”
To this he adds:

Purely rational thought, though it can explain the causes of death in scientific
terms, can never account for the fact that we can die at any moment and are
beings who, in any case, must die sooner or later. The length of our lives seems
to be fixed in a purely arbitrary way which, being inexplicable, defeats the
power of reason (p. 113).

Perhaps none of the existentialist theses about death strikes a more responsive
chord in ordinary readers than this claim that death is mysterious; and
it is echoed in countless statements found among poets, novelists, orators,
religious writers, and even psychologists.30 The feelings of helplessness and
horror which death inspires in most people seem to lead very naturally to
the remark that death is a mystery or to other remarks along the same lines.
It is probably sacrilege of the most damnable kind to subject such statements
to a critical examination, but those who prefer clear thinking to nebulous
rhetoric will not wish to shirk this task.
Of the various senses in which the word “mystery” has been used, either
in ordinary life or by philosophers and theologians, there seem to be only
two in which the statement that death is a mystery could make any sense.
We refer to something as a mystery in one of these senses if we do not know
its cause or if we are ignorant of certain of its features. It is in this sense that
various diseases are mysteries even at the present time and it is in this sense
that somebody, accustomed to the beautifully simple arrangement of streets
and avenues in Manhattan, is liable to find Brooklyn or the North Side of
Chicago baffling mysteries. Although this or that death may well be mysterious
in this first sense, it cannot be reasonably maintained that the same is
true of every death: We frequently do know the causes of death as well as
the surrounding circumstances. In any event, as Mr. Roubiczek remarks
quite explicitly, this is not the sense in which he or the existentialists declare
death to be a mystery.
In the other sense we say of something that it is a mystery if it conflicts,
or at least if it appears to conflict, with some proposition that is either well
established or extremely probable or at least fervently adhered to. Let us
suppose that a man whom we thought happy and exceptionally stable suddenly
suffers a psychotic breakdown or that he commits suicide. We would
be inclined to describe his breakdown or his suicide as mysteries. By this
we would mean that they cannot be reconciled with the proposition, apparently
based on very strong evidence, that he was happy and stable. It is
in this sense that believers in an all-powerful and all-good God frequently
use the word “mystery,” when they concede that evil or at least certain forms
of evil found in the world are a mystery. Now, if somebody believes in such
a God and if he regards death as something evil and as the kind of evil
which an all-powerful and all-good God might have been expected to prevent,
then his statement that death is a mystery makes perfectly good sense.
One may think that he is irrational in not abandoning his belief in an all-powerful
and all-good God in view of the facts of evil that cannot be reconciled
with such a belief, but that is another matter with which we are not
here concerned. What does concern us here is that it is not open to those
existentialists who are not believers in such a God to regard death as a
mystery in the sense under discussion. And to this it should be added that,
as far as one can judge from their writings, those existentialists who are
believers in an all-powerful and all-good God do not mean this either. Although
they may in fact be perplexed by the problem of evil, they are not
discussing this problem in any of its forms and shapes when they describe
death as a mystery.
I may of course be mistaken in thinking that the two senses just discussed
are the only ones in which the word “mystery” can be understood if the
declaration that death is a mystery is to make any sense. If I am mistaken,
I hope that an existential ontologist will come forward and tell us what other
sense there is in which death can be intelligibly characterized as a mystery.
If, however, I am right and the above two senses are the only ones to be
considered in this context, we may reach the following conclusion: It is
meaningful but false to maintain that death is always a mystery in the first
sense; religious believers could say something sensible by calling death a
mystery in the second sense, but the existentialists are not, and many of
them cannot be, using “mystery” in this sense.
We can perhaps obtain some understanding of what these writers are
doing by comparing their statements about the mystery of death with what
I have elsewhere called the “quasi-theological why.”31 People who do not
or who no longer believe in God nevertheless quite frequently ask such
apparent questions as “Why do I have to suffer so much?” or “Why is it
that, although I try so hard and mean so well, happiness in the end always
eludes me?” As asked by somebody who believes in a just and good God,
these are genuine questions, asking how the initial theological assumption
can be reconciled with the injustice and suffering experienced by the questioner.
However, when an unbeliever uses such language, we no longer
have anything that can be treated as a genuine question. What we have
before us are complaints about the nature of the universe, expressions of disappointment
and perhaps despair that the operations of the world are not
in accordance with the individual’s moral demands. Similarly, when somebody
like Mr. Roubiczek speaks of death as a mystery and excludes from
the start as irrelevant any information that science might provide, he may
well be using the word “mystery” in a quasi-theological way. He does not
seem to be raising a question but to be complaining about the “absurdity”
of death: He seems to be complaining that death occurs at all and, furthermore,
that there is no correspondence between the length of a human life
and the moral caliber of the particular human being.
It is a pity that Mr. Roubiczek does not identify the thinkers of the Age
of Reason whose “false convictions” he wishes to demolish. It is more than
doubtful that philosophers like Hume or Diderot would ever have wished
to dispute the assertion that death is a contingent fact or that the lack of any
correspondence between the length of a human life and the moral qualities
of the person in question is a feature of the world which cannot be further
explained. What they would probably have added to these admissions is that
one is not advancing the understanding of anything by referring to such contingent
facts as “mysteries.”

It may be helpful to bring together the main conclusions reached in this
article:
1. There is a real difference between what we called the “statistical” and
the “personal” perspectives. This, however, is not peculiar to the subject of
death; and the feelings and thoughts which constitute the personal perspective
can be made the object of scientific inquiry no less than other
psychological phenomena.
2. Death is the absence of life and is no more inconceivable than other
absences—e.g., the absence of sound or of clothes. Not only is death not inconceivable,
but in fact people conceive of it constantly and without the
slightest difficulty.
3. Although this or that death may be a mystery, it is not true that all
deaths are necessarily mysterious—at any rate the only people who could
justly make such a claim are believers in an all-powerful and all-good God
provided they also regard death as an evil which such a God might have
been expected to prevent.
4. The doctrine of the privacy of death, whether it is expressed by the
statement that everybody dies isolated and alone or by the statement that in
the matter of death one cannot have a representative, is either false or platitudinous,
asserting no more than that everybody eventually dies.
5. The writings of the existentialists are pervaded by a confusion of death
with dying. Many of their pronouncements which are absurd when interpreted
as statements about death cease to be absurd when treated as statements
about dying.
6. It is claimed by many existentialists that their question “What is
death?” is distinct from scientific questions about the nature of death, from
religious questions about survival, and from such metaphysical questions as
“Why does death occur at all?” Their question is said to be concerned with
the “ontological character” of death. However, we found that when they
discuss the “ontological character” of death, the existentialists do one of
two things: They either address themselves to certain psychological and
practical issues in which case the use of the word “ontological” is highly
misleading; or else they engage in what we called the “ontological quest”
which amounts to asking “What does death feel like to the dead?” and this
turned out to be a grotesque pseudoinquiry. It would perhaps be claiming
too much to say that there is no genuine ontological question here, but if
there is one, this has yet to be demonstrated.

NOTES
1. I wish to thank my friends Martin Lean, Donald Levy, Margaret Miner, Mary
Mothersill and Elmer Sprague for reading an earlier version of this manuscript and
for making helpful suggestions.
2. The only person known to me who habitually slept in his own coffin was
“Lord” Timothy Dexter, an illiterate Yankee trader who made a fortune during the
Revolutionary War and who subsequently settled in Newburyport, Massachusetts.
There he built a Hall of Fame containing statues of Napoleon, Benjamin Franklin,
George Washington, George III, and himself as well as a mausoleum with an enormous
coffin painted white and green. To enjoy the coffin while he was still alive,
Dexter had a couch put into it and not infrequently he took his nap on the couch.
Brahms was an eccentric man, but it was not his habit to sleep in a coffin.
3. The World as Will and Idea, Vol. III, p. 382.
4. The Experience of Death, p. 13.
5. The Courage to Be (from now on referred to as CB), p. 36.
6. Systematic Theology, Vol. I, pp. 188-189.
7. “The Eternal Now,” in H. Feifel (ed.), The Meaning of Death, pp. 30-31.
8. Macquarrie’s discussions of death are contained in his An Existentialist Theology
(abbreviated from here on as ET) and Studies in Christian Existentialism
(from now on abbreviated as SCE).
9. From now on referred to as BD.
10. There is a constant shift in Mora’s discussion from talk about death to talk
about the possibility of death. I am ignoring this here because “experiences of the
possibility of death” have nothing to do with the original ontological aim which,
in Professor Mora’s words, is “to scrutinize in detail the nature of human death”
(p. 170). In a later section of this article I shall provide a detailed account of the
chronic shifts of existentialists from ontological issues to psychological and moral
questions.
11. See page 475 above.
12. Adapted from Bertrand Russell’s Portraits from Memory, p. 147.
13. Social Principles and the Democratic State, p. 200. Needless to say, Benn
and Peters are not the first to make this distinction. It is already found in Aristotle
and Descartes.
14. “Death and My Life,” The Review of Metaphysics, 1963, p. 22.
15. Psychologie der Weltanschauungen, p. 261.
16. For a discussion of the peculiar arguments in support of the claim that although
the death of others is conceivable and imaginable, one’s own death is not,
see my article “My Death,” The Encyclopedia of Philosophy, Vol. V, pp. 416-419.
17. The only ordinary sense known to me in which we ever say of somebody that
he experiences death occurs in connection with people, like doctors, nurses, and
coroners, who frequently observe dead bodies (and dying patients). By saying that
these people “experience death” we mean that they habitually observe human beings
as they are dying and their dead bodies shortly after death has taken place.
18. The word “inner” is used ambiguously in much the same way as “existential.”
Biology, we are told, cannot explore the “inner” nature of death; and sometimes
this means the inner nature as it would appear to the deceased if he could attend
to it while at other times it just refers to attitudes toward death on the part of the

19. The Challenge of Existentialism, p. 218.
20. op. cit., pp. 238-239.
21. See H. Feifel, “Death—Relevant Variable in Psychology,” in R. May (ed.),
Existential Psychology, the same author’s “Death” in N. L. Farberow (ed.), Taboo
Topics, and the contributions by C. W. Wahl, N. H. Nagy, R. Kastenbaum, H. Feifel,
A. A. Hutschnecker, G. J. Aronson, and E. S. Shneidman and N. L. Farberow to
H. Feifel (ed.), The Meaning of Death, op. cit. Feifel’s own contribution to this
volume contains a valuable bibliography.
22. One notable exception is Jaspers who is hardly ever guilty of this confusion
and who in fact makes a clear distinction between them on several occasions. Thus
he writes: “Death cannot be an experience. Whoever has an experience is still alive”
(General Psychopathology, p. 477). Again: “Every report on dying persons refers
to their attitude to death, not to death itself” (p. 478).
23. The text is as follows: “Death is not an event of life—death is not lived
through” (Tractatus Logico-Philosophicus, 6.4311).
24. Quoted by Jaspers in his General Psychopathology, p. 478.
25. In one place Macquarrie remarks (SCE, p. 237) that although a “sinless”
person cannot avoid death any more than one who is “fallen” or living inauthentically,
their deaths are significantly different—“the end of the ‘sinless’ person would
be somehow different from death as we ordinarily know it.” Macquarrie, it should
be emphasized, is not in any way referring to some differences in the afterlife; but
if so, what he says can make sense only if he means “dying” and not “death.” In
the absence of an afterlife, the deaths of the sinless and the sinful man are not “somehow”
different. When they are dead, neither of them is alive and neither can take
satisfaction in anything or suffer any regrets about a previous inauthentic existence.
26. What is Existentialism?, p. 63.
27. Cases of this sort are found both in literature and in real life. Thus Hume,
shortly before his death, wrote, referring to the days of his final illness: “Were I to name
the period of my life which I should most choose to pass over again, I might be
tempted to point to this later period.” Matthias Clausen, in Gerhart Hauptmann’s
magnificent play, Vor Sonnenuntergang, is experiencing his greatest love and his
deepest feelings during his last months. Many other such examples could be cited.
28. Sein und Zeit, p. 240.
29. “We speak of time in three ways or modes: the past, present and future. Every
child is aware of them, but no wise man has ever penetrated their mystery. . . . The
mystery of the future and the mystery of the past are united in the mystery of the
present. . . . The mystery is that we have a present; and even more, that we have
our future. Also because we anticipate in the present; and that we have our past;
also because we remember it in the present. The present, our future and our past are
ours” (“The Eternal Now,” op. cit., pp. 31-36, 37). One cannot help wondering what
is troubling Tillich: Under what circumstances would time no longer be a mystery?
Unless Tillich can answer this question, it is not easy to see what is meant by saying
that time is a mystery.
30. Herman Feifel, the psychologist, who seems otherwise a sensible man and
not given to nebulous pronouncements, cannot refrain from speaking of death as
“the eternal mystery” (“Death—Relevant Variable in Psychology,” in Rollo May (ed.),
Existential Psychology, op. cit., p. 61).
31. See my article “Why?” The Encyclopedia of Philosophy, Vol. VIII, pp. 296-
302.
4. Historical Studies
THE QUADRATURE BY LUNES IN THE
LATER MIDDLE AGES
Marshall Clagett

One of the most popular mathematical problems in the Middle Ages
concerned the quadrature of the circle. This problem continually intrigued
both mathematicians and natural philosophers, the former because it made
them reach beyond Euclid’s Elements and the latter because it bore on the
question of the possible equation of rectilinear and curvilinear motions.1
Before the introduction of the Archimedean solution contained in his
Measurement of the Circle, Aristotle’s dictum in the Categories holding that
the solution was knowable but not yet known seems to have prevailed.2
From the late twelfth century the conventional solution was that of Archimedes,
as the various versions of his Measurement of the Circle (which I
published in my Archimedes in the Middle Ages, chapters 3 and 5), amply
illustrate. However, one proposed (but erroneous) solution by the means of
the quadrature of lunes (which had no relationship to the Archimedean
solution) circulated quite widely in competition with the Archimedean
solution. This was the quadrature of the circle by lunes described by Simplicius
(from Alexander of Aphrodisias, who probably took it from some
early source other than Hippocrates of Chios whose lune quadratures are
also presented by Simplicius via Eudemus in the passage immediately after
the Alexandrian passage3) in his Commentary on the Physics of Aristotle
(Ed. of H. Diels, Berlin, 1882, pp. 56-57). I have already published two
medieval versions of this solution.4 The first was a verbatim translation from
the Greek, probably executed by Robert Grosseteste. The second was a
paraphrase of the first. The main difference between the two Latin versions
is that Version I faithfully includes, while Version II omits, Simplicius’ (i.e.,
Alexander’s?) comment that the proof is false “since that which was not
universally demonstrated is taken as universally demonstrated, for the
quadrature of every lune is not demonstrated but only that of the lune which
is subtended by the side of an [inscribed] square.”5 (The Greek text adds a
phrase missing in the Latin translation: “while the lunes [considered here]
are upon the sides of the hexagon described in the circle.”6) The omission
of the whole of Simplicius’ comment in Version II thus leaves the impression
that the quadrature of the circle can be accomplished by the quadrature of
lunes in the manner suggested in this proof.
Still, Simplicius’ comment of admonition did not go unheeded in the later
Middle Ages. One author inserted in Version II a significant justification of
the basic premise between the initial proof of the quadrature of a lune on
the side of an inscribed square (hereafter to be called the tetragonal lune)
and the subsequent quadrature of the circle by lunes constructed on the sides
of an inscribed hexagon (hereafter to be called hexagonal lunes), which
hexagonal lunes were themselves considered squarable as the result of the
initial quadrature of the tetragonal lune. This insertion ran as follows:7

“Therefore, let it be supposed that it is possible to square a lune described on
the side of a square and, just as this is so, that it is possible also to square any
lune described on the side of any figure inscribable in a circle, as for example,
on the side of a hexagon. This supposition can be confirmed by that principle
which Campanus postulated in the first book of the Elements of Euclid and
which he uses in demonstrating XII.2 of Euclid. For just as the lune described
on the side of a square is related to the lune described on the side of a hexagon,
so any square is related to some other square, by that principle, [which is this:]
‘Any magnitude is to some second magnitude as any third magnitude is to a
fourth’ [Cf. ed. of the Elements, Basel, 1546, p. 3]. And you can easily deduce
by V.7, V.12, V.21 of Euclid that this square is equal to the lune described on
the side of a hexagon; however, you more easily deduce that by the use of
alternate ratios. With this [quadrature of the lune on the side of a hexagon]
presupposed, it will be demonstrated that any circle can be squared.”

The author of this copy of Version II apparently felt that this so-called
proof justified the procedure followed in the succeeding proof. But it is clear
that this comment constituted not a “construction” proof but only an
“existence” proof. Briefly we can say that the author, following Campanus’
postulate (which is, in fact, misnamed since it appears in the Adelard
version of the Elements on which Campanus based his version), held
that for any ratio of lunes (L1/L2), where, say, L1 is the tetragonal
lune and L2 the hexagonal lune in the solution under consideration, there
must exist an equal ratio of squares (Q1/Q2), where Q1 is any given square
and Q2 is some other square, or L1/L2 = Q1/Q2. By the alternation of
ratios L1/Q1 = L2/Q2. But in the first part of the proof it is proved that
there is a Q1 equal to L1; therefore, there must be a Q2 equal to L2. However,
it should be obvious that we are not told how to construct such
a square Q2. (Incidentally, there is an interesting “existence” proof found
in the Quadrature of the Circle composed by the brilliant mathematician
and physicist Ibn al-Haitham [Alhazen],8 where Alhazen arrives at the
conclusion that a circle inscribed in the lune on the side of a square is
some determinate part of that lune. With that part assumed as found, for
he in no way could find that part by construction, the proof is successfully
completed.)
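The gap between an "existence" proof and a "construction" proof can be made concrete with a short numerical sketch. It is a modern illustration and no part of the medieval text: it assumes a circle of diameter d = 1 for the tetragonal lune and a circle of radius d for the hexagonal lune (as in the solution under consideration), uses the modern value of π, and its function names are hypothetical. On these assumptions the tetragonal lune is the rational quantity d²/8, while the hexagonal lune comes out as d²(√3/4 − π/24); a square Q2 equal to it certainly exists, but its side involves π, and the proportion alone supplies no means of drawing it.

```python
import math

def segment_area(r, central_angle):
    """Area of the circular segment cut off by a chord subtending central_angle (radians)."""
    return 0.5 * r * r * (central_angle - math.sin(central_angle))

def lune_on_chord(r, central_angle):
    """Area of the lune bounded by the semicircle erected on a chord and the arc of the circle."""
    chord = 2 * r * math.sin(central_angle / 2)
    semicircle = math.pi * (chord / 2) ** 2 / 2
    return semicircle - segment_area(r, central_angle)

d = 1.0                                              # assumed diameter of the circle to be squared
tetragonal_lune = lune_on_chord(d / 2, math.pi / 2)  # L1: on the side of the inscribed square
hexagonal_lune = lune_on_chord(d, math.pi / 3)       # L2: on the side of the hexagon, radius d

q1 = tetragonal_lune                                 # the square Q1 found in the first part of the proof
q2 = q1 * hexagonal_lune / tetragonal_lune           # L1/L2 = Q1/Q2 fixes Q2 only as a magnitude
side_of_q2 = math.sqrt(q2)                           # depends on pi: it exists, but is not constructed
print(tetragonal_lune, hexagonal_lune, side_of_q2)
```

The proportion determines Q2 only once the area of L2 is already known, which is exactly what the proof needed to supply.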
Beyond the emended text of Version II with its added justification, I have
recently discovered what I believe to be two further reactions to Simplicius’
510 4. HISTORICAL STUDIES

comment, and their publication and analysis is the object of the remainder
of this paper. The first is included in a fifteenth-century manuscript of
Glasgow University Library (BE 8.y.18, 209v-210r). Its date of composition
is unknown, but since all of the other quadrature items included in this
section of the manuscript are from the thirteenth and fourteenth centuries,
I would guess that this piece was done no later than the fourteenth century.
It constitutes a third version of the Simplicius passage, in general much
closer to the paraphrase of Version II than to the translation of Version I.
But the fact that the author of Version III remarks on the falsity of the
basic assumption (see the text and translation below) links it to Version
I, which, I have no doubt, the author must have seen. The reason that
the author gives for the falsity of the assumption that the hexagonal lune
can be squared in the manner of the tetragonal lune is that “the greater the
polygon and the circle, the greater is the lune, because the circumference
of the larger circle which is less curved [for a given chord] cuts a smaller
segment from the lune and conversely the circumference of a smaller
circle cuts a greater segment from it because it is more curved.” The
author of Version III thus seems to believe that the false proof assumes
that the hexagonal lune is constructed on a line of the same length as that
on which the tetragonal lune is constructed, these lines being chords, as
they must, of unequal circles. Actually the original author of the false
proof does not have the two different lunes constructed on equal chords
but rather on chords that do have a fixed ratio (chords AD and LP in Figs.
1.2, 1.3). The original author apparently believed that, since the semicircles
constructed on the sides of the hexagon were in fact twice the semicircle
constructed on the side of the square (i.e., semicircle LP = 2 semicircle
AFD), the respective lunes must have a fixed ratio of rectilinear figures
which could be determined by the fact that the lune on the side of the
square is equal to a quarter of the square (i.e., lune AFD = △ABD).
At any rate, the original author was certainly wrong in thinking that his
reference back to the quadrature of the tetragonal lune provided any
assistance in determining a rectilinear figure equal to the hexagonal lune.
Now our medieval author of Version III goes no further in his criticism,
but, having had his say, reproduces the proof in a manner similar to
Simplicius’ account as it appeared in Versions I and II.
A second late medieval reaction to the comment by Simplicius on the
falsity of the proof is found, I believe, in the De arte mensurandi of
Johannes de Muris, an eminent mathematician and astronomer whose
career at the University of Paris can be traced from at least 1318 to some
time after 1345.9 Actually, in this work (written shortly after 1343) John
does not mention the comment of Simplicius, but rather he treats several
propositions on lunes (which were, in fact, partially drawn from the
Simplicius fragment) without tying them in any way to the problem of the
quadrature of the circle. By taking the quadrature of lunes out of the
context of the quadrature of the circle he has left us with what was
essentially valid in the Alexandrian passage quoted by Simplicius. Of the
four propositions whose text I include below, two (Propositions 26 and
28) are found in the Simplicius piece while the other two (Propositions 27
and 29) appear to be original with John. Proposition 26 effects the quadrature
of the tetragonal lune and no doubt has its origin in the first paragraph
of the Simplicius fragment. Proposition 28 shows the equality of (1)
the sum of the three hexagonal lunes plus a hexagonal semicircle to (2)
half of the inscribed hexagon. While it was, in all probability, also drawn
from the Simplicius fragment, all references to Proposition 26 or to the
quadrature of the circle have been removed, no doubt because of Simplicius’
warning. The first of John’s additional propositions, Proposition 27,
holds that the sum of a hexagonal lune and its complementary lune on the
third side of the triangle inscribed in a semicircle (of which
the side of the hexagon and the diameter of the semicircle are the other
two sides) is equal to one third of the inscribed hexagon (i.e., to the
triangle inscribed in the semicircle). This proposition and proof are quite
similar to a more general proposition proved by Alhazen in the course of
his quadrature of the circle,10 but there seems to be no trace of a Latin
translation of Alhazen’s work. The second of the additional propositions,
Proposition 29, shows the hexagonal lune to be less than a one-sixth part
of the inscribed hexagon, while the complementary triangular lune is shown
to be greater than that one-sixth part. I know of no possible medieval or
antique source of this proposition. It, like the others, was to play a role
in further propositions (which I have not here edited) that attempted
by numerical approximation to give the areas of lunes, segments, and
circles, assuming 3 1/7 as the value of π. My edition of Propositions 26-29
is based on two manuscripts of the De arte mensurandi, Paris, BN lat.
7380, ff. 47r-v (=D) and BN lat. 7381, 99v-100v (=E), with the
figures taken from D alone since they are missing in E.
In both the text of Version III and the propositions drawn from the
De arte mensurandi I have capitalized the letters designating geometrical
quantities, although they are always minuscules in the manuscripts. I have
also capitalized the enunciations in the text of John’s propositions in
order to reflect the use by the scribes of a larger hand to write the enunciations.
The references made by Johannes de Muris to the Elements of
Euclid are to the version prepared by Campanus, and where the proposition
numbers differ from those in the Greek text I have indicated the
latter in parentheses in the translation. In my translation, I have followed
John in designating the lune on the side of a square as a “tetragonal
lune,” that on the side of a hexagon as a “hexagonal lune,” the semicircle
on the side of a hexagon as a “hexagonal semicircle,” and so on.

NOTES
1. Thus Gerard of Brussels used the quadrature of the circle as given by Archimedes
in order to make such an equation; see M. Clagett, “The Liber de motu of
Gerard of Brussels,” Osiris, Vol. 12 (1956), 112-120, 152-156.
2. M. Clagett, Archimedes in the Middle Ages, Vol. 1: The Arabo-Latin Tradition
(Madison, 1964), 606-609.
3. T. Heath, A History of Greek Mathematics, Vol. 1 (Oxford, 1921), 183-200.
Heath gives a complete discussion of both the passage from Alexander and that from
Eudemus. It is the passage that Simplicius drew from Alexander which circulated in
the Middle Ages in Latin.
4. Clagett, op. cit. in note 2, 610-626.
5. Ibid., 619.
6. Ibid., 625.
7. Ibid., 620, variant reading to lines 14-19; 625.
8. H. Suter, “Die Kreisquadratur des Ibn el-Haitam,” Zeitschrift für Mathematik
und Physik, Vol. 44 (1899), Hist.-lit. Abteilung, 33-47.
9. L. Thorndike, History of Magic and Experimental Science, Vol. 3 (New
York, 1934), 294-324; P. Duhem, Le Système du monde, Vol. 4 (Paris, 1954),
34-38, 51-60.
10. Suter, op. cit. in note 8, 37-38: “Wir sagen: Wir ziehen in einem beliebigen
Kreis einen Durchmesser, nehmen dann auf einem der Halbkreise einen beliebigen
Punkt an, und ziehen von demselben zwei Gerade nach den beiden Endpunkten des
Durchmessers; hierauf beschreiben wir über diesen beiden Geraden zwei Halbkreise,
so sind die von den beiden Halbkreisen und den Bogen des ersten Kreises begrenzten
Mondfiguren zusammen gleich dem Dreieck im ersten Kreis. Wir haben diesen Satz
schon in unserm Buche über die Mondfiguren bewiesen, doch wollen wir den Beweis
hier nochmals wiederholen: Es sei der Kreis ABG gegeben (Fig. 1.1), sein Mittelpunkt
sei D, wir ziehen durch D den Durchmesser ADG und nehmen auf dem Umfang des
Kreises den Punkt B an, ziehen dann die beiden Geraden BG und AB, und beschreiben
über denselben die beiden Halbkreise AEB und BZG; nun sagen wir, dass
die beiden Monde AEBH und BZGT zusammen gleich dem Dreieck ABG seien.

FIGURE 1.1

Beweis: Von irgend zwei Kreisen verhält sich der eine zum anderen wie das Quadrat
des Durchmessers des einen zum Quadrat des Durchmessers des andern, wie im
zweiten Satze des 12. Buches der Elemente bewiesen worden ist, also
Kreis BZG : Kreis BEA = BG² : AB²;
durch Zusammenziehung ergiebt sich:
BG² + AB² : AB² = BZG + BEA : BEA;
nun ist aber BG² + AB² = AG², also
AG² : AB² = BZG + BEA : BEA.
Aber es ist auch AG² : AB² = Kreis ABG : Kreis BEA, also hat man:
BZG + BEA : BEA = ABG : BEA,
mithin ist Kreis ABG = BZG + BEA, also auch
Halbkreis ABG = Halbkreise BZG + BEA.
Wenn wir nun die beiden Segmente AHB und BTG, die dem Kreise ABG und den
beiden Kreisen AEB und BZG gemeinschaftlich sind, (beiderseits) wegnehmen, so
bleibt: Dreieck ABG = den beiden Monden AEBH und BZGT zusammen, w.z.b.w.—
Wenn nun die beiden Bogen AHB und BTG einander gleich sind, so sind auch AB
und BG einander gleich, ebenso die beiden Kreise AEB und BZG, also auch ihre
Hälften und ebenso die Monde AEBH und BZGT; ziehen wir noch BD, so sind auch
die beiden Dreiecke ABD und BDG einander gleich, also ist auch jeder einzelne der
beiden Monde gleich jedem einzelnen der beiden Dreiecke, also z.B. der Mond
AEBH gleich dem Dreieck ABD.”

I. [QUADRATURA CIRCULI PER LUNULAS: VERSIO III]


(Glasgow Univ. Library MS BE 8.y.18, 209v-210r)

209v Quadratio circuli est hec. Sit circuli quadrandi semicirculus

FIGURE 1.2

ADC [Fig. 1.2]. Dyameter eius ABC. Certum est quod angulus D tri-
anguli ADC est rectus. Ergo lateris sibi oppositi, videlicet ABC,
quadratum, quod est AEC, valet duo quadrata descripta super duo
5 latera sibi opposita, AD et DC, que sunt DGC et aliud sibi oppositum.
Ergo per consequens duplum est ad unum illorum sicut ad quadratum
DGC cum ambo sint equalia. Sed sicut se habet quadratum super ali-
quam lineam descriptum ad quadratum super aliam descriptum sic se
habet semicirculus super primam lineam descriptus ad semicirculum
10 super aliam lineam descriptum. Ergo semicirculus ADC descriptus
super lineam ABC erit duplus ad unum semicirculorum descriptorum
super unum duorum laterum AD et DC et per consequens equalis duobus
sicut quadratum descriptum super lineam eandem ABC equale est duobus
210r quadratis super alia duo latera descriptis. Demptis / ergo que sunt
15 communia magno semicirculo et duobus parvis, scilicet duobus arcu-
bus H et I, que remanebunt erunt equalia, scilicet due lunule AFD
et DKC et triangulus ADC. Sed triangulus potest quadrari; ergo due
lunule possunt quadrari. Et sic lunula vel lunule possunt quadrari.
Tunc suppono quod lunula super latus cuiuscunque alterius figure
20 utpote exagone descripta possit quadrari sicut super latus quadrati
descripta: Quid tamen falsum est. Quanto enim figura plurium angu-
lorum et circulus maior, tanto maior est lunula, quia minorem portio-
nem ab ea abscindit circumferentia maioris circuli que minus est gel-
bosa (/globosa), et econtrario circumferentia minoris circuli maio-
25 rem abscindit portionem quia magis curva et per consequens lunula
minor.

FIGURE 1.3
Supposito tamen hoc accipio lineam LMN duplam ad lineam ABC.
Ergo quadratum eius, scilicet MNO (/ LNO) quadruplum est ad quadra-
tum ACE, ut patet ad sensum, et per consequens semicirculus LPQN super
dictam lineam descriptus quadruplus erit ad semicirculum ADC per
dictam propositionem, sicut se habet quadratum alicuius linee ad
quadratum alterius linee sic semicirculus ad semicirculum. Valet ergo
semicirculus LPQN tres semicirculos LP, PQ, QN descriptos super
tria latera medietatis figure exagone et alium semicirculum quartum,
scilicet T, qui omnes sunt equales semicirculo ADC. Demptis ergo eis
communibus, scilicet tribus arcubus R,S,V, remanebunt adhuc equalia.
Sed ipso remanet MS medietas figure exagone, scilicet LPQN, et tres
lunule supraposite cum semicirculo T. Ergo ista sunt equalia. Sed
medietas exagone figure potest quadrari cum sit ex triangulis; ergo
totum residuum potest quadrari, scilicet tres lunule cum semicirculo.
Sed extracto quadrato equali tribus lunulis que ut probatum est pos-
sunt quadrari remanet adhuc quadratum, extracto enim quadrato a quadrato
remanet quadratum, et illud est equale semicirculo T. Ergo
semicirculus potest quadrari, ergo etiam totus circulus, quod est
quod intendimus. Explicit quadratura circuli secundum alium doctorem.

I. [THE QUADRATURE OF THE CIRCLE BY LUNES: VERSION III]

The quadrature of the circle is this: Let ADC be the semicircle of the
circle to be squared, and ABC its diameter. It is certain that angle D of
△ADC is a right angle. Therefore, the square of the side opposite it,
namely ABC, which [square] is AEC, equals the two squares described
on the two sides opposite it (ABC), namely [the sides] AD and DC, which
[squares] are DGC and the one opposite it. Therefore, it (square AEC)
is double to either of those squares, as, for example, double square DGC,
since the two squares are equal. But the square described on some line
is related to the square described on another line as is the semicircle
described on the first line to the semicircle described on the second line
[by Euclid, Elements, XII.2]. Therefore, semicircle ADC described on
line ABC will be double the semicircle described on either of the two
sides AD and DC, and consequently it (semicircle ADC) is equal to
both [together], just as the square described on the same line ABC is equal
to the [sum of the] squares described on the other two sides. Therefore, if
the quantities common to the larger semicircle and the two smaller semicircles
(namely, the segments H and I) are removed, the remainders,
[on the one hand] the lunes AFD and DKC and [on the other] the triangle
ADC, will be equal. But the triangle can be squared; therefore, the two
lunes can be squared. And thus a lune or lunes can be squared.

Then I suppose that the lune described on the side of any other figure,
as for example, that on the side of a hexagon, could be squared in the
same way as the lune described on the side of the square. This, however,
is false. For the greater the polygon and the circle, the greater is the lune,
because the circumference of a larger circle which is less curved [for a
given chord] cuts a smaller segment from the lune, and conversely the
circumference of a smaller circle cuts a greater segment from it because it is
more curved, and consequently the lune is smaller.
However, if this is supposed [namely, that any lune can be squared],
I take line LMN = 2 line ABC. Therefore, its square LNO = 4 square
ACE, as is evident to the sense, and consequently semicircle LPQN
described on the said line will be quadruple semicircle ADC by the said
proposition: The square of some line is related to the square of another
line as [the one’s] semicircle is to [the other’s] semicircle. Therefore,
semicircle LPQN is equal to the three semicircles LP, PQ, and QN
described on the three sides of the semihexagon plus another fourth semicircle,
namely T, each of which semicircles is equal to semicircle ADC.
Hence, if the common segments R, S, V are subtracted, equal quantities
will still remain. But the remainders to this MS are (1) the half of the
hexagonal figure, LPQN, and (2) the three lunes posited above plus the
semicircle T. Therefore, these quantities [(1) and (2)] are equal. But
the half of the hexagonal figure can be squared since it is composed of
triangles. Therefore, the whole remainder can be squared, namely the three
lunes plus the semicircle. But with the square equal to the three lunes
subtracted (which lunes, as has been proved, can be squared), a square
still remains (for if a square is subtracted from a square a square remains),
and that square is equal to semicircle T. Therefore, the semicircle can be
squared, and also the whole circle, which is what we intended. Here ends
the Quadrature of the Circle according to another doctor.
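As a modern check on the arithmetic of this proof (and no part of the medieval text), the following sketch verifies numerically that the three hexagonal lunes plus semicircle T do equal the half hexagon. It assumes a unit diameter ABC and the modern value of π, and its variable names are illustrative only. The equality itself is sound; what fails is the supposition that the hexagonal lune can be squared on its own, without which semicircle T cannot be isolated.

```python
import math

d = 1.0                                            # assumed diameter ABC of the circle to be squared
big_semicircle = math.pi * d ** 2 / 2              # semicircle LPQN on LMN = 2 * ABC (radius d)
small_semicircle = math.pi * (d / 2) ** 2 / 2      # each of LP, PQ, QN and T (diameter d)

# Segment of the large circle cut off by one side of the hexagon (central angle 60 degrees).
segment = 0.5 * d ** 2 * (math.pi / 3 - math.sin(math.pi / 3))
hexagonal_lune = small_semicircle - segment
half_hexagon = 3 * (math.sqrt(3) / 4) * d ** 2     # three equilateral triangles of side d

three_lunes_plus_t = 3 * hexagonal_lune + small_semicircle
print(three_lunes_plus_t, half_hexagon)
```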

II. JOHANNIS DE MURIS DE ARTE MENSURANDI, CAPITULUM 6m
47r /26a propositio

LUNULAM TETRAGONAM QUARTE PARTI TETRAGONI CIRCULO INSCRIPTI
EQUALEM ESSE.
Sit circulus ABC supra centrum D [Fig. II.1], perpendicularis
5 DB per 11 primi, BC latus tetragoni per 6 4i circulo inscripti super
quod semicirculus BFC describatur. Dico lunulam ex duobus curvis
contentam quam voco tetragonam equalem esse triangulo BDC, qui est
quarta pars tetragoni per 4 primi. Est enim angulus ABC rectus per
30 3ii. Igitur quadratum AC valet duo quadrata linearum AB, BC

FIGURE II.1

10 per penultimam primi, et per consequens duplum ad quadratum linee
BC. Igitur per 2 12i circulus ABC duplus est ad circulum BFC, ad
medietatemque medietas. Erit ergo quarta circuli, que est BC,
equalis semicirculo BFC. Ergo dempta communi portione remanet lunula
equalis triangulo supradicto, quod est propositum.

FIGURE II.2

15 Eodem modo potest ostendi de alia lunula tetragona, immo de
omnibus lunulis tetragonis pari ratione. Ex hoc clare patet [Fig.
II.2] quod portio tetragona circuli ABC equalis est ad duas tetra-
gonas portiones circuli BFC. Nam cum due medietates eiusdem circuli
sint equales per communem conceptum, demptis equalibus, scilicet
20 lunula tetragona et triangulus supradictus que sunt equales ut visum
est, remanet portio tetragona circuli ABC contra duas tetragonas
portiones circuli BFC; quare propositum. Et inde est quod medietas
eius uni illarum equalis est, licet hec sint in 24a huius anterius
demonstrata.

25 27a propositio
LUNULAM HEXAGONAM CUM LUNULA TRIGONA TERTIE PARTI HEXAGONI
CIRCULO INSCRIPTI EQUALEM ESSE NECESSE EST.
Sit circulus AEC super centrum D [Fig. II.3], latus vero hexa-
goni AE, latus trigoni erit CE per 15 2i huius. Super igitur utrum-
que latus semicirculus figuretur. Dico lunulas protractas simul
sumptas equales esse triangulo AEC, qui est tertia pars hexagoni circulo
inscripti. Angulus enim AEC rectus est per 30 3ii. Ergo per penultimam
primi quadratum AC valet duo quadrata AE, EC; igitur et circulus

FIGURE II.3

circulos, et semicirculus semicirculos, per 2 12mi. Demptis
35 igitur communibus portionibus, restant due lunule supradicte equales
triangulo antedicto, quod est propositum.

28a propositio
TRES LUNULAS HEXAGONAS CUM SEMICIRCULO HEXAGONALI MEDIETATI
HEXAGONI CIRCULO INSCRIPTI EQUALES ESSE.

FIGURE II.4

40 Sit semicirculus AEC in quo sint tria latera hexagoni AE, EG,
GC [Fig. II.4] per 3 primi vel 15 4i. Super quodlibet latus semicirculus
ambiatur ac unus semicirculus B exterius alteri illorum
equalis per 3 primi et ex diffinitione equalium circulorum. Dico 3
lunulas descriptas cum semicirculo B equales esse medietati hexagoni
45 que est AEGC. Est namque quadratum dyametri AC quadruplum ad quad-
ratum dyametri AE per 3 petitum 2i huius, et circulus ad circulum et
cetera sicut supra. Ergo demptis communibus portionibus hexagonis
restant 3 lunule hexagone cum semicirculo B equales semihexagono
supradicto, quod est propositum. Ex precedenti et presenti infertur
50 quod 4 lunule hexagone et lunula trigona cum semicirculo hexagonali
B sunt equales quinque sextis hexagoni circulo inscripti, quod
sicut corollarium capiatur.

29a <propositio>
LUNULAM HEXAGONAM SEXTA PARTE HEXAGONI ESSE MINOREM TRIGO-
55 NAMQUE MAIOREM.

FIGURE II.5

Ambe lunule simul sumpte equales sunt tertie parti hexagoni,
scilicet triangulo AEC [Fig. II.5], per precedentem, quem triangulum
linea ED dividit in duos triangulos equales partiales AED, EDC per
38 primi. Quorum quilibet est pars hexagoni sexta per communem con-
60 ceptum. Sed lunula trigona maior est hexagona, ut ostendam. Ergo
ipsa lunula trigona maior est triangulo EDC, quod est propositum.
Igitur et hexagona lunula tanto minor est triangulo AED, quod volui
ostendere.
Suppositum declaratur. Stante eadem figura que prius, additur
65 linea DP ad verticem lunule hexagone et linea DO ad verticem lunule
trigone. Quia arcus AE dividitur per medium per DP, ideo et sua
corda in puncto I; similiter et corda EC in F. Statutum est paral-
lelogrammum rectangulum DIEF, quia angulus AEF rectus per 30 3ii, et
angulus DIE [rectus] per 3 3ii, et per eandem angulus DFE; igitur
70 et angulus FDI per 29 primi. Quare per 27 et 28 primi DI, FE sunt
47v equidistantes et equales, et ideo EI, DF equales. / Similiter et
EI, IP, quia a centro. Ergo IP, DF equales. Igitur additur equali-
bus, scilicet FO addita ipsi DF, et DI addita IP; que quidem FO, DI
sunt equales, quia FO equalis EF cum a centro, et FE, DI equales,
75 ut visum est. Ergo exeunt DP, DO equales; a quibus equalibus amotis,
scilicet DN, DA, quia utraque semidyameter, restant QP, NO equales.


Cadit ergo maxima cathetus lunule hexagone intra trigonam. Ergo
trigona maior hexagona necessario esse convincitur, quod fuit
suppositum. Figurabilis ergo est lunula hexagona intra trigonam.
80 Vice igitur linee AE subeat ST. Semicirculo itaque figurato SOT
circulum contingente EOC in puncto O per 19 huius, palam est lunu-
lam hexagonam intra trigonam comprehendi. Pars est ergo hexagona
trigone; igitur minor ea. Quomodo autem vice linee AE subeat linea
ST docet 11 3ii huius. Etiam tracta ST equidistanti EC aut facta
85 OG equali PI ducta perpendiculari GT in utramque partem, exibit ST
equalis AE. Vel fac arcum ST equalem arcui AE. Erunt corde equales
per 28 3ii. Diviso circulo in 6 partes per 15 4i, suppositis pro
una duabus igitur propositum.

Variant Readings for Johannes de Muris


1 propositio E,om.D 5 DB D AB E 12 Erit D Est E 17 equalis correxi ex
dupla 25 propositio E,om.D 26 lunulam hexagonam D lunula tetragona E
29 CE D EC E 33 AC D AEC E 36 triangulo D,om.E 53 29a D 29 mg. E /
propositio addidi 65 lunule1 mg. D portionis D lunule portionis E 66
medium D medietatem E / DP ?D YPO E 74 et D est E / equales2 ?D
equalis E 76 restant corr. ex restat 77,82 intra corr. ex infra 79 ergo est
D,tr. E / intra D inter E 83 Quomodo corr. ex quem in D que in E 87
circulo corr. ex semicirculo / pro E per D

II. ON THE ART OF MEASURING, CHAPTER 6, BY JOHANNES DE MURIS

Proposition 26
A TETRAGONAL LUNE IS EQUAL TO ONE FOURTH OF A SQUARE INSCRIBED
IN A CIRCLE.
Let circle ABC be described about center D [see Fig. II.1], with a perpendicular
DB [constructed] by I.11 [of the Elements of Euclid], and with
BC the side of the square inscribed in the circle by IV.6 [of the Elements].
And on the side BC let semicircle BFC be described. I say that the lune
contained by the two curves (and this lune I call a tetragonal lune) is equal
to △BDC, which is one fourth of the square by I.4 [of the Elements]. For
∠ABC is a right angle by III.30 (= Gr. III.31). Therefore, AC² = AB² +
BC², by the penultimate proposition of [Book] I [of the Elements], and
hence AC² = 2 BC². Therefore, by XII.2 [of the Elements], circle ABC =
2 circle BFC; hence semicircle ABC = 2 semicircle BFC. Therefore, the
quarter circle BC = semicircle BFC. Therefore, with the common segment
removed, the lune remains equal to the aforesaid triangle, which is that
proposed.
In the same way this can be demonstrated for another tetragonal lune,
in fact for all tetragonal lunes by like argument. From this it is clearly
evident that a tetragonal segment of circle ABC is equal to two tetragonal
segments of circle BFC [see Fig. II.2]. For since two halves of the same
circle are equal by a common axiom, when [from these halves] two equals
are removed, namely the tetragonal lune and the above-said triangle (and
these are equal as has been seen), the tetragonal segment of circle ABC
is left equal to two tetragonal segments of circle BFC; hence the proposed
[corollary]. And thence it is that half of it is equal to one of those, although
these things might have been demonstrated earlier in the twenty-fourth
[proposition] of this [chapter].

Proposition 27
IT IS NECESSARY THAT A HEXAGONAL LUNE PLUS A TRIANGULAR LUNE
BE EQUAL TO ONE THIRD OF A [REGULAR] HEXAGON INSCRIBED IN A CIRCLE.
Let circle AEC be [described] about center D [see Fig. II.3], while the
side of the [regular] hexagon will be AE and the side of the triangle CE,
by the fifteenth [proposition] of the second chapter of this [work, the De
arte mensurandi]. Therefore, on each side let a semicircle be drawn. I say
that the lunes drawn [on these sides] taken together are equal to △AEC,
which is one third of the [regular] hexagon inscribed in the circle. For
∠AEC is a right angle by III.30 (= Gr. III.31). Hence, by the penultimate
[proposition] of [Book] I [of the Elements] AC² = AE² + EC². Therefore,
circle [AEC is equal] to circles [AE and EC]; and semicircle [AEC is equal]
to semicircles [AE and EC]. Therefore, with the common segments removed,
the above-said lunes remain equal to the afore-said triangle, which
is that proposed.
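The equality asserted in Proposition 27 holds for any point E on the semicircle, as in the more general theorem of Alhazen cited in the introduction. A brief numerical sketch, offered as a modern illustration under the assumption of a circle of radius r and with a hypothetical function name, confirms it for the hexagonal case and can be tried for any other angle.

```python
import math

def two_lunes_and_triangle(r, angle_at_centre):
    """For a right triangle AEC inscribed in a semicircle of radius r, where arc AE
    subtends angle_at_centre (radians) at the center, return the areas of
    (lune on AE, lune on EC, triangle AEC)."""
    ae = 2 * r * math.sin(angle_at_centre / 2)
    ec = 2 * r * math.cos(angle_at_centre / 2)

    def lune(chord, theta):
        semicircle = math.pi * (chord / 2) ** 2 / 2
        segment = 0.5 * r * r * (theta - math.sin(theta))
        return semicircle - segment

    lune_ae = lune(ae, angle_at_centre)
    lune_ec = lune(ec, math.pi - angle_at_centre)
    triangle = 0.5 * ae * ec          # right angle at E, so the legs are AE and EC
    return lune_ae, lune_ec, triangle

# Hexagonal case: AE is the side of the inscribed hexagon (arc of 60 degrees).
hexagonal, triangular, triangle_aec = two_lunes_and_triangle(1.0, math.pi / 3)
third_of_hexagon = (3 * math.sqrt(3) / 2) / 3
print(hexagonal + triangular, triangle_aec, third_of_hexagon)
```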

Proposition 28
THREE HEXAGONAL LUNES PLUS A HEXAGONAL SEMICIRCLE ARE EQUAL
TO HALF OF A [REGULAR] HEXAGON INSCRIBED IN A CIRCLE.
Let the semicircle be AEC, in which there are three sides of the [regular]
hexagon: AE, EG, GC [see Fig. II.4], by I.3 or IV.15 [of the Elements].
And let a semicircle be described on each side, as well as a semicircle outside
equal to each of these by I.3 and the definition of equal circles. I say
that the three described lunes plus semicircle B are equal to one half of the
[regular] hexagon, which half is AEGC. For AC² = 4 AE², by the third
postulate of [Chapter] 2 of this [work]. And circle [AEGC equals] circles
[AE, EG, GC and B]; and so on, as above. Therefore, with the common
hexagonal segments subtracted, the three hexagonal lunes plus semicircle
B remain equal to the above-said semihexagon, which is that proposed.
From the preceding and the present it is inferred that four hexagonal lunes
and a triangular lune plus hexagonal semicircle B are equal to five sixths
of the [regular] hexagon inscribed in the circle, which conclusion is accepted
as a corollary.

Proposition 29
A HEXAGONAL LUNE IS LESS THAN ONE-SIXTH PART OF THE [REGULAR
INSCRIBED] HEXAGON, AND [ITS COMPLEMENTARY] TRIANGULAR LUNE IS
MORE [THAN THAT QUANTITY].
Both lunes taken together are equal to one third of the hexagon, that is,
to △AEC [see Fig. II.5], by a preceding proposition (Prop. 27). Line ED
divides this triangle into two equal partial triangles: AED, EDC, by I.38
[of the Elements]. Each of these is a sixth part of the hexagon, by a common
axiom. But the triangular lune is greater than the hexagonal lune as
I shall show. Therefore, this triangular lune is greater than △EDC, which
is that proposed. Therefore, the hexagonal lune is by the same amount less
than △AED, which I have wished to demonstrate.
The supposition [in re one lune being greater than the other] is demonstrated:
Keeping the same figure as before [Fig. II.5], line DP is added up
to the top of the hexagonal lune and line DO to the top of the triangular
lune. Since arc AE is bisected by DP, so is its chord bisected in point I.
Similarly, the chord EC is bisected in F. [Accordingly,] a rectangle DIEF
has been formed, for ∠AEF is a right angle by III.30 (= Gr. III.31) and
∠DIE [is a right angle] by III.3, and also ∠DFE by the same proposition;
therefore, ∠FDI [is also a right angle] by I.29. Hence, by I.27 and
I.28, DI and FE are parallel and equal, and hence EI and DF are equal.
Similarly, EI and IP are equal since they are radii of the same circle. Hence,
IP and DF are equal. Therefore, these equals are added to equals, i.e.,
FO is added to DF and DI is added to IP; FO and DI indeed are equals,
for FO is equal to EF, both being radii of the same circle, and FE and DI
are equals, as has been seen. Therefore, DP and DO are equals. If from
these equals are removed DN and DA, equals because each is a radius, the
equals QP and NO remain. Therefore, the maximum perpendicular of the
hexagonal lune falls within the triangular lune. Therefore, the triangular
lune is necessarily demonstrated to be greater than the hexagonal lune,
which was that supposed. Therefore, the hexagonal lune can be drawn
within the triangular lune. Hence let ST substitute in place of line AE. And
so, with the semicircle SOT drawn tangent to circle EOC in point O by
[Proposition] 19 of this [chapter], it is evident that the hexagonal lune is
included within the triangular lune. Therefore, the hexagonal lune is a part
of the triangular lune, and so is less than it. Moreover, [Proposition] 11 of
[Chapter] 3 of this [work] teaches how to substitute ST in place of line AE.
Also with line ST drawn parallel to EC, or with line OG, which has been
made equal to PI, drawn perpendicular to GT in both directions, ST becomes
equal to AE. Or make arc ST equal to arc AE. The chords will be
equal by III.28 (=Gr. III.29). With the circle divided into 6 parts by
IV.15 and with the two [halves of the hexagon] supposed instead of one,
the proposition follows.
ISAAC NEWTON’S PRINCIPIA, THE
SCRIPTURES, AND THE DIVINE
PROVIDENCE1
I. Bernard Cohen

Newton’s insistence that “the system of the world”2 gives evidence of the
Creator is well known to scholars through published and generally available
primary sources: chiefly the concluding General Scholium to the
Principia,3 the later Queries of the Opticks,4 and the four letters he wrote
to Richard Bentley on the occasion of the latter’s two sermons on “The
Confutation of Atheism from the Origin and Frame of the World.”5 Because
the General Scholium was written only in 1712-1713 for the second
edition of the Principia and did not appear in the first edition in 1687,6 and
because the Queries of the Opticks dealing with God are not part of the
first edition of 1704,7 there appears at first glance to be some possible
justification for the view advanced by Biot and by Laplace in which Newton’s
concern with religious questions is referred to a late period of his life.8
During some fifteen years of intimacy with Newton’s writings, in manuscript
as well as in print, during the preparation (in collaboration with
Alexandre Koyré) of an edition of the Principia with variant readings, I
have been made aware how false and misleading it is to relegate Newton’s
concern with theological matters to the end of his life, as a supposed consequence
either of senility or of a personal crisis that led to a mental breakdown.
The purpose of this article shall be to demonstrate primarily that
Newton’s concern with God and with the Divine Providence was a continuing
feature in all editions of his Principia, but that this information is
not available to the average reader. Secondarily, I shall show how this topic
is related to the form and significance of the Definitions at the beginning of
the Principia. (See Appendix I on pp. 537 ff. below.)
At least half of the General Scholium deals with religious topics. Paragraph
three discloses some of the major regularities of the solar system,
leading to the conclusion that such a

most beautiful System of the Sun, Planets and Comets, could only proceed
from the counsel and dominion of an intelligent and powerful being.9

Having thus introduced this “being,” Newton quite logically proceeds to
delineate his nature and attributes as follows:

This Being governs all things, not as the soul of the world, but as Lord over all:
And on account of his dominion he is wont to be called Lord God παντοκράτωρ,
or Universal Ruler. For God is a relative word, and has a respect to servants,
and Deity is the dominion of God, not over his own body, as those imagine who
fancy God to be the soul of the world, but over servants. The supreme God is a
Being eternal, infinite, absolutely perfect; but a being, however perfect, without
dominion, cannot be said to be Lord God; for we say, my God, your God,
the God of Israel, the God of Gods, and Lord of Lords; but we do not say,
my Eternal, your Eternal, the Eternal of Israel, the Eternal of Gods; we do not
say, my Infinite, or my Perfect: These are titles which have no respect to
servants. The word God usually signifies Lord; but every lord is not a God.
It is the dominion of a spiritual being which constitutes a God; a true, supreme
or imaginary dominion makes a true, supreme or imaginary God. And from
his true dominion it follows, that the true God is a Living, Intelligent and
Powerful Being; and from his other perfections, that he is Supreme or most
Perfect. He is Eternal and Infinite, Omnipotent and Omniscient; that is, his
duration reaches from Eternity to Eternity; his presence from Infinity to Infinity;
he governs all things, and knows all things that are or can be done. He
is not Eternity or Infinity, but Eternal and Infinite; he is not Duration or Space,
but he endures and is present. He endures for ever, and is every where present;
and by existing always and every where, he constitutes Duration and Space.
Following a lengthy discussion of such topics as whether God “is omnipresent,
not virtually only, but also substantially” and whether blind
“metaphysical necessity, which is certainly the same always and every
where,” could produce a “variety of things,” Newton concluded this portion
of the Scholium10 with these words:

And thus much concerning God; to discourse of whom from phaenomena, belongs
to Experimental Philosophy.
A strong statement, written in enthusiasm! We may well understand that
Newton would have felt the necessity to tone it down in the third and final
edition, where it reads that thus to discourse of God, “from phaenomena,”
belongs to “Natural Philosophy.”
One part of this General Scholium is particularly fascinating to the
modern critical philosopher or historian of science: Newton’s discussion of
God, and his various names, from a linguistic point of view. In particular,
our attention is drawn by Newton to the fact that our expression God—
as in “my God, your God, the God of Israel, the God of Gods”—is different
from “Eternal,” “Infinite,” or “Perfect.” These are absolute terms or
“titles which have no respect to servants” and so are unlike “God [which]
is a relative word, and has a respect to servants.” In the light of this presentation,
we should not find it too surprising that a second place in the
Principia where Newton introduces a religious note is also a discussion of
the relative and the absolute in terms of linguistic analysis.

The occasion is another famous Scholium, the one following the Definitions,
at the very beginning of the Principia: thus preceding even the
Axioms or Laws of Motion which, together with the Definitions, make up
the introductory material prior to Book One. In this Scholium Newton introduces
distinctions (with regard to Time, Space, Place, and Motion) of
Absolute and Relative, True and Apparent, Mathematical and Common.
Relative quantities, he concludes, are not “the quantities themselves, whose
names they bear.” Rather, they are the “measures” of such quantities
“which are commonly used instead of the measur’d quantities themselves.”
This leads Newton to the following conclusion, a primary statement for the
linguistic analysis of science:

And if the meaning of words is to be determin’d by their use; then by the
names Time, Space, Place and Motion, their [sensible] measures are properly
to be understood; and the expression will be unusual, and purely Mathematical,
if the measured quantities themselves are meant.11

To read this exposition is to have the feeling of being half-way from
Aristotle to Bridgman!
Then Newton concludes the discussion in these words:

Upon which account, they do violence to the Sacred Writings, who there
interpret those words for the measur’d quantities. Nor do those less defile the
purity of Mathematical and Philosophical Truths, who confound real quantities
themselves with their relations and vulgar measures.12

Newton’s argument is much like Galileo’s, when the latter sought to avoid
criticism of the Copernican system as a contradiction of Scripture. It is the
job of the Bible, said Galileo in a famous pun, to instruct us how to go to
heaven, and not to tell us how the heavens go! The Bible is written, for
greatest intelligibility, Galileo argued, in the language of the vulgar.13
One of the most fascinating aspects of Newton’s statement about the
Sacred Writings (i.e., Sacrae Litterae, Scripture, or the Bible) is that it is
not to be found in the version of Newton’s Principia current in the English-
speaking world. Andrew Motte’s English translation, as revised by Florian
Cajori, often reprinted and widely read, has a quite different sentence altogether.
There we find:

On this account, those violate the accuracy of language, which ought to be
kept precise, who interpret these words for the measured quantities.14

How can Cajori have revised Motte so as to give such a different reading?
Let us turn at once to Newton’s original words in Latin, which are:

Proinde vim inferunt sacris literis, qui voces hasce de quantitatibus mensuratis
ibi interpretantur.

In other words, as Galileo had said, the Bible is written in the language
of ordinary discourse. Hence those who try to interpret Scripture in terms
of absolute measured quantities rather than the vulgar relative quantities
do violence to the Scriptures.
Lest there be any doubt on this question of Newton’s meaning, I shall
present the testimony of two members of Newton’s circle. First, John
Clarke, brother of the famous Samuel Clarke who participated in the
“Leibniz-Clarke correspondence”15 and who translated Newton’s Opticks
into Latin (1706). John Clarke translated into English Rohault’s famous
textbook of Cartesian physics containing brother Samuel’s notes, notes
which contradicted the text by presenting the new Newtonian philosophy.16
In 1730, John Clarke published a short version in English of Newton’s
Principia, in part a direct translation and in part an extended paraphrase.
Presenting Newton’s distinction between “measured quantities themselves”
and their “sensible measures,” Clarke concluded:
So that if we would define the Sense of Words, by the common use of
them, then by the Words Time, Space, Place and Motion, we are properly to
understand these sensible Measures; and if we should mean by them, the
Quantities themselves that are measured, our Discourse would be out of
the common way and merely Mathematical. Those Persons therefore who in the
Interpretation of Scripture apply these Words to the Quantities measured, do
violence to those Sacred Writings. . . .17
My second witness is Isaac Newton’s successor as Lucasian Professor at
Cambridge, William Whiston. In the English version of his Latin lectures
on the Newtonian natural philosophy, which proves to be—like Clarke’s
book—in part a translation of the Principia and in part a paraphrase with
critical comments, Whiston wrote:
Wherefore, if the Significations of Words are to be defin’d from their Use,
by the Names of Time, Space, Place, and Motion, these Measures are properly
to be understood; and the Expression will be unusual and purely Mathematical,
if the absolute Quantities themselves be understood. And therefore as they do
Violence to the Holy Scripture, who there interpret these Words, as intending
the absolute Quantities; so also do those who from the Rest assign’d to the
Earth, and Motion to the Sun, in the Words of the Scripture, are wont to
dispute concerning the true Frame of the World, contrary to evident Reasons
of Astronomy and Philosophy; as they do likewise, if such there be, who from
the Words wherein it is predicted, Time shall be no more, do from thence
collect, that Eternal Duration, or Absolute Time, shall be annihilated. Nor do
those any whit less defile Mathematicks and Philosophy, who confound the true
Quantities with their Relations and vulgar Measures.18

It will be observed that Whiston refers the reader directly to the question
of the literal interpretation of Scripture with regard to the motion of the
earth and the sun’s standing still.
As to Newton’s intended meaning, we may turn to other uses of “sacrae
litterae” in his manuscripts. In one of these, Newton himself presents a point
of view explicitly the same as Galileo’s and the one attributed to him by
Whiston in the quotation just given. Newton writes:
8 Systema corporum coelestium in sacris literis minime doceri.
[The system of the heavenly bodies is not at all taught in Scripture.]

12 Nihil obstare quo minus Terra pro lege Planetarum circa solem moveatur.
Diluuntur objectiones ex sacris litteris.
[Nothing stands in the way of the Earth’s moving around the Sun according
to the law of the Planets. Objections from Scripture are removed.]

13 Diluuntur objectiones ex mechanica.
[Objections from mechanics are removed.]19
In an “Avertissement au Lecteur,” written out by Newton for Clarke to
use as his introduction20 to a French edition of the “Leibniz-Clarke correspondence,”21
Newton refers in English to “the scriptures” in almost the
very same context as the Scholium to the Definitions (in the Principia)
which we have been discussing in this article. Newton says:
& as the scriptures generally spake of God by allusions & figures for want of
proper language: so in these Letters the words Quality & Property were used
only by a figure to signify the boundless extent of Gods existence with respect
to his ubiquity & eternity, & that to exist in this manner is proper to him alone.22

For Newton “sacrae litterae” was clearly and unquestionably to be translated
by Scripture, or the Scriptures.23
On the eve of writing the Principia, that is, in 1685, Newton wrote out
a tract which he either never completed or which has been preserved only
as a fragment, entitled De Motu Corporum in medijs regulariter cedentibus
(On the motion of bodies in regularly yielding media). In Definition 18
we find that Newton has given us a pre-draft of the passage under discussion,
but now related to the general proposition that all quantities measuring
times, spaces, motions, speeds, and forces are proportional to that
which they measure. Then Newton says, in explanation,
It seemed best to explain all these things rather fully so that the Reader
might approach the following [material] freed from certain common prejudices
and imbued with the distinct concepts of Mechanical principles. It
was necessary [replacing I have tried], moreover, carefully to distinguish absolute
and relative quantities from one another; because all phaenomena depend
on absolute quantities, and yet the common people, who do not know how to
abstract their thoughts from their senses, always speak of relative quantities,
to the point where it would be absurd for either wise men or even for the
Prophets to speak otherwise among them. Whence both the Scriptures and the
writings of Theologians are always to be understood of relative quantities,
and he would be laboring with a gross prejudice who thence [i.e., on the basis of
those writings] stirred up disputations about the philosophical [apparently
replacing absolute] motions of natural things. It’s just as if someone should
contend that the Moon in the first chapter of Genesis was counted among the
two greatest lights not by its apparent, but by its absolute, magnitude. [Last
sentence canceled.]24

The Biblical reference is:


[Genesis 1:16] And God made two great lights; the greater light to rule the
day, and the lesser light to rule the night: he made the stars also. Fecitque
Deus duo luminaria magna: luminare majus ut praeesset diei, et luminare minus,
ut praeesset nocti, et stellas.

After completing the tract just mentioned, Newton wrote out a draft of
the Definitions, Laws of Motion, and Book One of the Principia, of which
a considerable fragment is preserved, having been deposited by Newton in
the University Library as if it comprised the lectures which he was required
to give and so to deposit according to the terms of his professorship.25 In
this manuscript, the Scholium following the Definitions differs from the later
and final printed version in one respect that is directly related to the subject
of this article. The paragraph following Newton’s discussion of “sensible
measures” and “measured quantities” begins:
It is indeed a matter of great difficulty to discover, and effectually to distinguish,
the true motions of particular bodies from the apparent; because the
parts of that immovable space, in which those motions are performed, do by no
means come under the observation of our senses.26

In this early draft, or the so-called Lucasian Lectures, the next sentence
begins:
Solus enim Deus, qui singulis immobiliter et insensibiliter

a beginning which is canceled. From the next sentence,27 it may be conjectured
that Newton intended to write:
For only God, who [gives motion to] individual bodies without moving and
without being perceived, [can truly distinguish true motions from apparent.]

Clearly, Newton needed to exercise great will power to refrain
from making these excursions into theology in the final version of the
Principia.
If the evidence is so clear as to the meaning of what Newton wrote and
if the Latin appears to be so unambiguous, how did the Motte-Cajori
version originate? Not with Andrew Motte, who knew his Latin, and who
translated these sentences quite properly as follows:
Upon which account, they do strain the Sacred Writings, who there interpret
those words for the measur’d quantities. Nor do those less defile the purity of
Mathematical and Philosophical Truths, who confound real quantities themselves
with their relations and vulgar measures.28

Cajori’s “improvement” is taken from an eighteenth-century translation of
the Principia by Robert Thorp, of which only the text of Book One was
published (in two editions), although a prospectus29 was published announcing
that the whole of the three Books had been translated and would
be published.30 I presume that Cajori preferred Thorp’s version to Motte’s
not so much because he considered it to be a more accurate rendition of
Newton’s Latin original,31 but rather because it had more of the ring of
what Cajori thought appropriate to a mathematico-physical treatise.32

I have said earlier that a mention of God and a discussion of the Divine
Providence was a continuing feature of all editions of Newton’s Principia.
In Book Three (in which Newton displays the system of the world on the
basis of the mathematical principles expounded in Books One and Two),
Corollary 4 of Proposition VIII discusses the densities of the planets in
relation to their relative sizes and distances from the sun. In the second and
third editions (1713; 1726), this corollary states:

The smaller the Planets are, they are, caeteris paribus, of so much the greater
density. For so the powers of gravity on their several surfaces, come nearer to
equality.

Then Newton goes on to explain that the planets are also,

caeteris paribus, of the greater density, as they are nearer to the Sun. So Jupiter
is more dense than Saturn, and the Earth than Jupiter. For the Planets were to
be placed at different distances from the Sun, that according to their degrees
of density, they might enjoy a greater or less proportion of the Sun’s heat.

This may be taken as an example, I believe, of what Newton had in mind
in the General Scholium when he said that the solar system “could only
proceed from the counsel and dominion of an intelligent and powerful
being.” As evidence that each planet is so placed that it receives just the
proper amount of heat, no more and no less, Newton presents the following
information:

Our water, if it were remov’d as far as the orb of Saturn, would be converted
into ice, and in the orb of Mercury would quickly fly away in vapour. For the
light of the Sun, to which its heat is proportional, is seven times denser in the
orb of Mercury than with us: and by the thermometer I have found, that a
sevenfold heat of our summer-sun will make water boil. Nor are we to doubt,
that the matter of Mercury is adapted to its heat, and is therefore more dense
than the matter of our Earth; since, in a denser matter, the operations of
nature require a stronger heat.33

This passage may afford us a glimpse into Newton’s all-but-continuous
revision of his Principia from 1687, when the first edition was published,
until 1726, when the third and ultimate edition appeared. In the first
edition, this material on the properties of planets and their relative positions
is presented somewhat differently. This topic is the subject of Corollary 5
rather than Corollary 4 and it is only in part equivalent to the later Corollary 4
just quoted. That is, only the final section—beginning “Our water,
if it were remov’d” and ending “require a stronger heat”—is identical in
both. In the first edition, the previous sentence reads:

Collocavit igitur Deus Planetas in diversis distantiis a Sole, ut quilibet pro
gradu densitatis calore Solis majore vel minore fruatur.34

This may be compared with the later version:

In diversis utique distantiis a Sole collocandi erant Planetae ut quilibet pro
gradu densitatis calore Solis majore vel minore frueretur.

That is, in the first edition, Newton would have us believe that:

God therefore placed the planets at different distances from the Sun so that
according to their degrees of density they may enjoy a greater or less proportion
of the Sun’s heat.
But in the later editions, it is merely said that “the planets were to be
placed” at such distances that “they might enjoy” their due proportion of
the sun’s heat, the explicit reference to God (“Deus”) having been eliminated.
Hence, we see that in the first edition Newton did mention God explicitly.
It is thus an error to say that Newton introduced the name of God
and his divine providence only in the later editions of the Principia (ed. 2,
1713; ed. 3, 1726), in the supplementary General Scholium. This example,
incidentally, serves to illustrate the importance of having available a critical
edition of the Principia, with a complete set of variae lectiones to enable
scholars to find out what was eliminated and added, and also what was
altered, from one edition to the next.
Why would Newton have eliminated this explicit reference to God in his
revision of the corollary in question? At first glance, it might appear
that he had decided to confine all explicit mentions of God to the final
General Scholium. For, indeed, he does mention God by name in the
second and third editions only in this General Scholium,35 while in the first
edition God is mentioned only in the corollary to Proposition VIII (Book
Three). But this opinion is confuted by the facts, since Newton eliminated
the reference to God in this corollary in the interleaved copy of the Principia
in his personal library,36 apparently long before he had even contemplated
a General Scholium. It is a reasonable guess that he would have
eliminated this reference to God after he had read the review of the first
edition of the Principia which was published in the Acta Eruditorum, for
in it the anonymous reviewer especially called the reader’s attention to the
use of planetary densities in Newton’s presentation of how God had created
the universe.37 Newton might very well have concluded that this topic
either required a more considerable discussion, perhaps with further examples,38
or else should not be mentioned at all.
But it is of more interest than a mere curiosity that Newton should have
mentioned God in all editions of the Principia, and not just in the second
and third editions, as is usually stated. For we know that Newton was of
strong religious bent, and that he conceived the structure of the world to
offer definitive proof of the existence of God and his continuing divine
providence. As he wrote to Bentley,

When I wrote my Treatise about our System [of the World, i.e., Principia
Lib. III], I had an Eye upon such Principles as might work with considering
Men, for the Belief of a Deity, and nothing can rejoice me more than to find
it useful for that Purpose.39

In other words, it was more in character for Newton to have included God
in all editions of his account of the “System of the World” in Book Three of
the Principia than it would have been had God been absent in the first
edition. (See Appendix II on pp. 542 ff. below.)

The second edition (1713) of the Principia has much new material on
comets, the subject with which Book Three of the Principia concludes in
the first edition (1687).40 Among the manuscripts in which Newton worked
out his attempted revisions and extensions of his presentation of the subject
of comets, there is a sheet containing a proposed alteration of Proposition
XLI (Book Three) which is especially fascinating. This single sheet shows
us how the very act of writing on cosmological questions almost automatically
was apt to lead Newton to questions about God similar to those
he would discuss in the General Scholium. This fragment reads as follows:

Moreover, the vapors which arise from the sun, fixed stars, and tails of Comets
seem to be condensed in the Planets and to be converted first into water and
moistures, then into mud and clay and salts and sand and the substances of
animals vegetables and minerals. Thus comes about the perpetual interchange
of all things; and the Lord of all alone remains immutable, who by his own
counsel and will disposes (through Ministers) all things in the best order, by
removing the fixed stars to convenient distances lest they fall into one another,
by placing the Planets in concentric orbs in the same plane with conspiring
motions, and the comets in excentric orbs and various planes with contrary
motions, and by granting to each one its just determination of motion and
velocity by which they may be able to describe their designated orbs, and by
forming in the beginning the species of all things in first seeds. It is philosophic
to set forth by what laws and in what way [or “by what reasoning”] the system
of things once set by God is conserved and perseveres; but how this most wise
order of things could have arisen from the counsel and will of its founder,
natural Philosophy does not teach [altered from it is not relevant to natural
Philosophy to set forth].

A canceled passage reads:

but to set forth how this most wise order of things could have arisen from
matter alone and motion or from the Nature of things, necessarily acting,
eternal, infinite, most wise, most powerful and supremely perfect, or from any
blind fate without final causes and therefore without God the Lord of all
things, is most foreign to Philosophy.41

To complete this subject,42 it will be of interest to see how, in the versions
of the Principia prior to the one from which the first edition was printed,
Newton on at least two occasions turned to discussions of God. The first of
these we shall consider is an early version of “The System of the World,”
which Newton later rejected for a completely rewritten “Book Three on the
System of the World.” Much confusion exists between the two, because after
Newton’s death, the early work was published separately, and has the title
De Mundi Systemate, which is the same as the title of Book Three of the
Principia. A number of manuscript copies of this work exist, but the full
text (which served as the basis of the final or corrected printed edition)
is to be found among Newton’s papers in the Portsmouth Collection in the
University Library, Cambridge.43 In this work, Newton—as might have
been expected—discussed God in relation to the same subject in which
God is introduced in Book Three. Newton states as a proposition:

Why some of the planets are more, others less dense, and the forces in all are
proportional to the quantities of matter.

Then he discusses it as follows:

That bodies so different in magnitude should come so near to a proportionality
with their forces, is not without some mystery.
It may be that the remoter planets, for want of heat, have not those metallic
substances and ponderous minerals with which our earth abounds; and that
the bodies of Venus and Mercury, as they are more exposed to the sun’s heat,
are also harder baked, and more compact.
For, from the experiment of the burning-glass, we see that the heat increases
with the density of light; and this density increases inversely as the square of
the distance from the sun; whence the sun’s heat in Mercury is proved to be
sevenfold its heat in our summer seasons. But with this heat our water boils;
and those heavy fluids, quicksilver and the spirit of vitriol, gently evaporate, as
I have tried by the thermometer; and therefore there can be no fluids in
Mercury but what are heavy, and able to bear a great heat, and from which
substances of great density may be formed.
And why not, if God has placed different bodies at different distances from
the sun, so that the denser bodies always possess the nearer places, and each
body enjoys a degree of heat suitable to its condition, and proper for its constitution?
From this consideration it will best appear that the weights of all
the planets are one to another as their forces.44

Newton then expresses his desire for more accurate measurements of the
diameters of the planets, and presents a method for achieving this end.

So deep were Newton’s religious convictions, and so strong his need of
expressing them, that again and again, while writing out the Principia, he
was drawn to make an explicit expression of his faith. The result was an
introduction of a reference to God and to the Divine Providence in both
versions of “The System of the World” and also in the early drafts of Book
One of the Principia which Newton later deposited in the University Library
as the lectures he had given as Lucasian Professor. The latter is perhaps
the most interesting of all, since the MS is written out in the hand of an
amanuensis, Humphrey Newton. Since only a fragment of the sentence is
written out, and then canceled, we may presume that Newton was dictating
to Humphrey, and that the discussion that it is God alone “who [gives motion
to] individual [bodies] without moving and without being perceived” came
out almost unwittingly. In the midst of his sentence, however, Newton evidently
had second thoughts and directed Humphrey to cancel this expression
of his religious views. The very existence of this fragment, however, like the
other statements about God we have been presenting here, may serve as a
continual reminder of how great the temptation always was for Newton to
stray from the strict and narrow path of science and to meander through
theological metaphysics.

NOTES
1. The research on Newton’s scientific thought, on which this article is based, has
been supported by a grant from the National Science Foundation.
A somewhat different version of this article is being published in the writer’s forthcoming
volume, Newton’s Natural Philosophy: Inquiries into Newton’s Scientific
Work and Its General Environment (in process).
I am grateful to Anne Whitman, who has worked with me on the transcription,
translation, and interpretation of Newton’s manuscripts.
2. Book Three of Newton’s Principia is entitled “The System of the World”
(“De Mundi Systemate/Liber Tertius”). An earlier version was published posthumously
and bears the title De Mundi Systemate. Florian Cajori, who revised
Andrew Motte’s translation of the Principia (1729), Sir Isaac Newton’s Mathematical
Principles of Natural Philosophy and his System of the World (Berkeley and Los
Angeles: University of California Press, 1934) attempted to distinguish between
the two by entitling Book Three of the Principia “System of the World (in Mathematical
Treatment),” but without indicating that the parenthetical expression was
not Newton’s. For a guide to Newton’s writings, see George J. Gray: A Bibliography
of the Works of Sir Isaac Newton, Together with a List of Books Illustrating His
Works (Cambridge: Macmillan and Bowes, 1888; second edition, considerably revised
and enlarged, Cambridge: Bowes and Bowes, 1907); H. Zeitlinger: “A Newton
bibliography,” pp. 148-170 of Isaac Newton, 1642-1727, a memorial volume edited
for the Mathematical Association by W. J. Greenstreet (London: G. Bell and Sons,
1927); A descriptive Catalogue of the Grace K. Babson Collection of the Works of
Sir Isaac Newton, and the Material Relating to Him in the Babson Institute Library,
Babson Park, Mass., with an introduction by Roger Babson Webber (New York:
Herbert Reichner, 1950); a supplement compiled by Henry P. Macomber was pub¬
lished by Babson Institute in 1955.
3. An edition of the Principia with variant readings, undertaken by the writer
and the late Alexandre Koyré, with the assistance of Anne Whitman, is being published
jointly by Harvard University Press and Cambridge University Press.
4. Conveniently available in a current reprint (New York: Dover, 1952).
5. Bentley’s sermons and Newton’s letters are reprinted, with a commentary by
Perry Miller, in I. Bernard Cohen and Robert G. Schofield (eds.): Isaac Newton’s
Papers & Letters on Natural Philosophy, and Related Documents (Cambridge:
Harvard University Press, 1958).
6. A textual comparison of all the editions of the Principia is available in the
forthcoming edition with variant readings (see note 3 supra).
7. For the development of the Queries, see Alexandre Koyré: “Les Queries de
l’Opticks de Newton,” Archives Internationales d’Histoire des Sciences, 1960, 13:
15-29; also I. B. Cohen, Preface to the Opticks (note 4 supra), pp. xxxiii sqq.; intro.
to Isaac Newton’s Papers (note 5 supra), pp. 14 sqq.
8. Biot’s statement seems cautious compared to the extravagance of Livio Stec-
chini’s recent declaration that, “Whereas the first edition of the Principia (1687) is
essentially rationalistic in spirit and follows a positivistic method, theological preoccupa¬
tions dominate the second edition.” To my knowledge, this is the first time that a
book has been said to have been “dominated” by one aspect of an appendix added to
a new edition! Stecchini’s claim seems all the more absurd if read in the light of the
removal of God’s name from the text of the Principia, as it had appeared in the first
edition. Furthermore, Stecchini’s statement is so at variance with the facts (since
the method and subject matter and the form and style of the text are so nearly
identical in all editions) that one suspects that he has either never read carefully in
any edition or did not comprehend what he did read. In any event, Newton’s major
statement of “a positivistic method” does not appear in the first edition of the
Principia, but in the second, in the new final Scholium Generale, following the theo¬
logical section, in Newton’s declaration that “I have not been able to discover the
cause of those properties of gravity from Phenomena and frame no hypotheses.” In
534 4. HISTORICAL STUDIES

any event, “it is enough that gravity does really exist, and act according to the laws
which we have explained, and abundantly serves to account for all the motions of
the celestial bodies, and of our sea.”
Stecchini’s essay is a sort of historical apologia for Immanuel Velikovsky and his
theory of catastrophes and it appears in a book intended to glorify Velikovsky and
to demolish his critics. It is typical of the lack of accuracy in this presentation as a
whole that Stecchini offers as “proof that Newton had become fixated on the religious
problem, but had not lost any of his intellectual flexibility, . . . that the few addi¬
tions that appear in the third edition of the Principia (1726), disclose that he came
to believe that God reveals himself not in the appearance of things but in the ways
of mankind.” This statement, taken from Cajori’s notes to the Principia, but without
the quotation marks of the original, does less than justice to Cajori, who wrote
(p. 670) that, “It is to be noted that Newton’s idea of God in the second (1713)
edition of the Principia is drawn largely from the ‘appearances of things,’ while the
interpolations printed in the third edition (1726) are taken more particularly from
the ‘ways of mankind.’” The phrase “ways of mankind” occurs in the new material
printed in the third edition in 1726, but—as Cajori correctly points out (loc. cit.)
and as Stecchini ignores—“substantially this addition Newton prepared long before
1726, in fact, only six months after the publication of the second edition in 1713. It is
given in a list of corrections and additions to the second edition, which he sent to
Cotes; but this was not printed at the time.” I do not know how Stecchini concluded
that there are but “few additions that appear in the third edition of the Principia,”
since this is far from the truth; perhaps he meant to say, “few additions that appear in
the General Scholium in the third edition. . . .”
Stecchini’s essay, entitled “The Inconstant Heavens,” appears on pp. 80-126 of
Alfred de Grazia, ed.: The Velikovsky Affair, the Warfare of Science and Scientism
(New Hyde Park, N.Y.: University Books, 1966). The discussion of the Principia
may be found on pp. 95-96. This book is in part a reprint of the September 1963
issue of the American Behavioral Scientist, but with much new material.
9. Quoted from Andrew Motte’s translation (1729; reprinted, London: Dawsons
of Pall Mall, 1968, with an introduction by I. B. Cohen), corresponding to p. 544 of
the Motte-Cajori version. For earlier versions of the General Scholium, see A. R.
Hall and Marie Boas Hall: Unpublished Scientific Papers of Isaac Newton (Cam¬
bridge: at the University Press, 1962), pt. IV, § 8.
10. I am not particularly concerned here with the evolution of the whole of the
General Scholium.
11. Quoted from the Motte version, p. 17.
12. This is my own adaptation of the Motte version of the Principia.
13. See, in particular, Galileo’s letter to the Grand Duchess Christina (1615),
translated into English with an introductory commentary in Stillman Drake: Dis¬
coveries and Opinions of Galileo (Garden City, N.Y.: Doubleday Anchor Books,
1957), pp. 173 sqq. Also Giorgio de Santillana: The Crime of Galileo (Chicago:
Univ. of Chicago Press, 1955).
14. Op. cit., p. 11.
15. See H. G. Alexander: The Leibniz-Clarke Correspondence (Manchester:
Manchester Univ. Press, 1956). For other editions of this famous work, see the
article by A. Koyré and I. B. Cohen (cited in note 22 below).
16. For an account of these annotations, see Michael Hoskin: “Clarke’s Notes to
Rohault’s Traite.” The Thomist, 1961, 24: 253—263; also G. Sarton: “The Study of
Early Scientific Textbooks,” Isis, 1948, 38: 137-148.
17. John Clarke: A Demonstration of Some of the Principal Sections of Sir Isaac
Newton’s Principles of Natural Philosophy (London: printed for James & John
Knapton, 1730), p. 24.
18. William Whiston: Sir Isaac Newton’s Mathematick Philosophy more easily
Demonstrated (London: printed for J. Senex & W. Taylor, 1716), p. 37.
19. This text is to be found among the MSS in the Portsmouth Collection in the
University Library, Cambridge (MS Add. 3965.542v).
20. This is not the only example of Newton having written a preface that would
be printed as if written by someone else. The second printing (or edition) of the
Commercium Epistolicum, of 1722, contained an anonymous “Ad Lectorem” which
proves to have been written by Newton; also a lengthy “Recensio Libri,” translated
into Latin from an English original that had been published in the Philosophical
Transactions (No. 342: Jan.-Feb. 1715), and which also proves to have been written
by Newton. This Commercium is allegedly a documentary report of a supposedly
international committee of the Royal Society, which investigated the charges of
plagiarism in the matter of the invention of the calculus; it is a vindication of Newton
and an utter condemnation of Leibniz, but proves to have been written by Newton
himself. On this subject see I. B. Cohen: “Newton and Recent Scholarship,” Isis, 1960,
51: 489-514, esp. n. 60.
21. See note 15 supra.
22. The text of this “Avertissement” in a number of different versions as found in
Newton’s MSS is printed with an extended commentary in A. Koyré & I. B. Cohen:
“Newton & the Leibniz-Clarke Correspondence, with notes on Newton, Conti, & Des
Maizeaux,” Archives Internationales d’Histoire des Sciences, 1962, 15: 63-126.
23. There are other examples, if needed, such as a section of “Interpretationes
Sacrarum Literarum” in Newton’s Common Place Book, which is followed by a list
of references to texts of Scripture and their interpreters. See Isaac Newton: Theo¬
logical Manuscripts. Selected and edited with an Introduction by H. McLachlan
(Liverpool: at the Univ. Press), p. 140.
In the translation of the Principia by the Marquise de Chastellet (Paris: chez
Desaint & Saillant [,&] Lambert, 1759), vol. 1, p. 15, this sentence is translated as
follows: “lorsqu’on trouve donc ces termes dans l’Ecriture, ce seroit faire violence
au texte sacré, si au lieu de les prendre pour les quantités qui leur servent de mesures
sensibles, on les prenoit pour les véritables quantités absolues. . . .”
24. U.L.C., MS Add. 3965, fols. 23-26. The Latin original reads as follows:
“Haec omnia fusius explicare visum est ut Lector praejudicijs quibusdam vulgaribus
liberatus [Claris del.] et distinctis principiorum Mechanicorum conceptibus imbutus
accederet ad sequentia. Quantitates autem absolutas et relativas ab invicem
sedulò [replacing acriter] distinguere necesse fuit [replacing conatus sum] eò, quod
phaenomena omnia pendeant ab absolutis, Vulgus autem qui cogitationes a sensibus
abstrahere nesciunt semper loquuntur de relativis, usque adeo ut absurdum foret vel
sapientibus vel etiam [vel etiam replaces aut] Prophetis apud hos aliter loqui. Unde
et Sacrae literae et Scripta Theologorum de relativis semper intelligenda sunt, et
crasso laboraret praejudicio qui inde de rerum naturalium motibus philosophicis
[apparently replacing absolutis] disputationes moveret. [Perinde est ac si quis Lunam
(in Gen. 1) magnitudine non apparente sed absoluta [numerari del.] inter duo max¬
ima lumina numerari contenderet. [Gen. 1. del.] del.]”
The word “philosophicis” is written in above “absolutis,” but the latter is not
canceled. Either Newton forgot to cancel “absolutis” or he had not made a firm
decision about this proposed substitution.
A version and translation of this tract is to be found in John Herivel: The Back¬
ground to Newton’s Principia (Oxford: at the Clarendon Press, 1965), pp. 304 sqq.
25. On this subject see J. Edleston: Correspondence of Sir Isaac Newton and
Professor Cotes (London: John W. Parker, 1850), pp. xcv sqq.
Edleston, in this portion of the introduction to his edition of the Newton-Cotes
correspondence, gives a table of all of the lectures deposited by Newton, which had
been preserved in the University Library, Cambridge.
26. Motte-Cajori edition, p. 12.
27. The next sentence originally read: “Verumtamen disputando, idque partim ex
viribus quae sunt motuum verorum causae et effectus, partim ex motibus apparentibus
qui sunt motuum verorum differentiae, possumus aliquid nonnunquam colligere.” It
may be translated as follows: “But by arguing, and doing so partly from the forces
which are the causes and effects of true motions, partly from the apparent motions
which are the differences of the true motions, we can sometimes gather something.”
This sentence was changed by Newton to become the two sentences which are found
in the printed editions of the Principia (the same in all with one minor variation) in
the final paragraph of the Scholium to the Definitions (“Causa tamen non est prorsus
desperata. Nam argumenta desumi possunt, partim ex motibus apparentibus qui sunt
motuum verorum differentiae, partim ex viribus quae sunt motuum verorum causae
& effectus.”) and which are translated by Motte as follows: “Yet the thing is not al¬
together desperate; for we have arguments to guide us, partly from the apparent
motions, which are the differences of the true motions; partly from the forces, which
are the causes and effects of the true motions.”
28. The Mathematical Principles of Natural Philosophy. Translated into English
by Andrew Motte (London: printed for Benjamin Motte, 1729, 2 vols.), vol. 1, p. 17.
See note 9, above.
29. Two copies of this prospectus are to be found in the History of Science
Library in the Old Ashmolean Museum in Oxford.
30. In Robert Thorp’s translation, the second edition (London: printed by A.
Strahan for T. Cadell jun. & W. Davies, 1802), p. 19, this sentence is translated:
“Upon which account they violate that accuracy of language, which ought to be held
sacred, who interpret those words of the measured quantities.”
31. Indeed, the evidence is quite strong that—at least in large measure, if not
entirely—Florian Cajori did not make his revision of Motte’s translation by a con¬
tinuous confrontation of Newton’s original Latin text and the Motte translation. I
have assembled a number of examples that show that even where Motte used the
second rather than the third edition, Cajori made no alteration; at other times, Cajori
altered Motte’s version (which happened to be based on the second edition) without
taking account of the fact that Newton himself had made an even better change in
the third edition. (Incidentally, we do not know why certain portions of Motte’s trans¬
lation are based on the second rather than the third edition; perhaps he had begun
to make this translation prior to 1726, when the third edition of the Principia was
published.) For these examples, and other material concerning the Cajori and Motte
versions of Newton’s Principia, see I. B. Cohen: “Pemberton’s Translation of New¬
ton’s Principia, with Notes on Motte’s Translation,” Isis, 1963, 54: 319-351, esp. pp.
341 sqq., 348 sqq. It should be added that we do not know how much of the final
revision of Cajori’s version, which was published posthumously, was done by the
editors rather than by Cajori himself.
32. An example of such an attempt to make Newton read more like a modern
book than the original may warrant is found in the treatment of the First Law of
Motion. Whereas Motte has Newton say: “Every body perseveres in its state of rest,
or of uniform motion . . . ,” Cajori’s revision reads, “Every body continues in its
state of rest, or of uniform motion. . . .” The word “persevere” may sound a little
too “active” for modern ears, but it is closer to Newton’s “perseverare” than the
more “passive” word “continues.”
33. Quoted from Motte’s translation, corresponding to the Motte-Cajori version,
p. 417.
34. Quoted from the first edition, Philosophiae Naturalis Principia Mathematica
(London: printed by Joseph Streater for Sam. Smith & other booksellers, 1687),
p. 415. A facsimile reprint of the first edition is available from William Dawson &
Sons, London.
The first part of this Corollary 5 is much longer in the first edition. The changes
are displayed in the critical edition of the Principia, in press (see note 3 supra).
35. I refer only to the actual text of the Principia, that is, to Newton’s text. In
the second edition, there is a considerable discussion of God at the end of Cotes’s
preface (Motte-Cajori version, pp. xxxi sqq.), which is also reprinted in the third
edition.
36. Newton’s interleaved copy of the first edition of the Principia is in the Ports¬
mouth Collection in the University Library, Cambridge. Newton also kept an an¬
notated copy of the first edition, which is at present among Newton’s books in the
Trinity College library, Cambridge. There are in these two libraries, also, similar
interleaved and annotated copies of the second edition. In the edition of the Principia
with variant readings, now in press (see note 3 supra), all of the annotations from
these four copies are included among the “variae lectiones,” as are also the variants in
the final manuscript used by the printer for the first edition.
37. This review appeared in the Acta Eruditorum for June 1688, pp. 305-315.
38. It is just such a discussion that one finds in the “Leibniz-Clarke corre¬
spondence,” in which Samuel Clarke was the spokesman for Newton. For the degree
of Newton’s participation in Samuel Clarke’s letters to Leibniz, see the article by A.
Koyré and I. B. Cohen cited in note 22 supra.
39. Quoted from Newton’s letter to Bentley, 10 December 1692, Isaac Newton’s
Papers . . . (cited in note 5 supra), p. 280; also available in the Royal Society’s
edition of the Correspondence of Isaac Newton, vol. 3, 1688-1694 (Cambridge:
published for the Royal Society at the University Press, 1961).
40. The first edition of the Principia ends abruptly with a discussion of comets at
the end of Book III. Newton had evidently planned a “Conclusio” which he wrote out
in two drafts but never completed for publication. This has been published together
with some unused portions of the Preface, by Professors A. R. Hall and Marie
Boas Hall in their volume of Unpublished Scientific Papers of Isaac Newton (cited
in note 9 supra), pt. IV, § 37. In neither this part of the Preface, nor the draft of
the Conclusion, nor in the Preface actually printed by Newton does he refer to
theological questions or mention God by name.
It is interesting to observe that, toward the end of the discussion of Proposition
XLI, Newton discussed some beneficial aspects of comets, even stating the likelihood
that the tails might turn into vapor which, “rarefy’d and dilated, may be
at last dissipated, and scatter’d through the whole heavens, and by little and little be
attracted towards the Planets by its gravity, and mixed with their atmosphere. For
as the seas are absolutely necessary to the constitution of our Earth, that from them,
the Sun, by its heat, may exhale a sufficient quantity of vapours, which being
gather’d together into clouds, may drop down in rain, for watering of the earth, and
for the production and nourishment of vegetables; or being condens’d with cold on
the tops of mountains, (as some philosophers with reason judge) may run down in
springs and rivers; so for the conservation of the seas, and fluids of the Planets,
Comets seem to be requir’d, that from their exhalations and vapours condens’d, the
wastes of the Planetary fluids, spent upon vegetation and putrefaction, and con¬
verted into dry earth, may be continually supplied and made up. For all vegetables
entirely derive their growths from fluids, and afterwards in great measure are turn’d
into dry earth by putrefaction; and a sort of slime is always found to settle at the
bottom of putrefied fluids. And hence it is, that the bulk of the solid earth is con¬
tinually increased, and the fluids, if they are not supplied from without, must be in a
continual decrease, and quite fail at last. I suspect moreover, that ’tis chiefly from
the Comets that spirit comes, which is indeed the smallest, but the most subtle and
useful part of our air, and so much required to sustain the life of all things with us.”
Quoted from Motte’s translation (see note 9 above), corresponding to the Motte-
Cajori version, pp. 529-530. This passage is essentially the same in all three editions.
41. U.L.C., MS Add. 3965, fol. 152v. The Latin original reads:
“Vapores autem qui ex sole stellis fixis et caudis Cometarum oriuntur, in Planetis
condensari videntur & converti primo in aquam & humores deinde in limum & lutum
& sales & arenam et substantias animalium vegetabilium et mineralium. Sic perpetua
fit rerum omnium vicissitudo & solus omnium Dominus manet immutabilis qui
consilio suo et voluntate omnia ordine optimo (per Ministros) disponit, removendo
Stellas fixas ad commodas distantias ne cadant in se invicem, locando Planetas in
orbibus concentricis in eodem plano cum motibus conspiranti(bus) & cometas in
orbibus excentricis & planis varijs cum motibus contrarijs, & tribuendo unicuique
justam motus determinationem et velocitatem qua Orbes designatos describere possint,
& rerum omnium species in seminibus primis formando sub initio. Quibus legibus et
qua ratione systema rerum olim a Deo positum conservatur & perseverat philosophicum
est exponere: at quomodo sapientissimus hic rerum ordo a consilio et voluntate conditoris
oriri potuit, Philosophia naturalis non docet [from non pertinet ad Philosophiam
naturalem exponere].”
The canceled passage reads:
“at quomodo sapientissimus hic rerum ordo a materia sola et motu aut a rerum
Natura necessario agente aeterna infinita sapientissima potentissima & summe perfecta
aut a caeco quocunque fato sine causis finalibus atque adeo sine rerum omnium
Domino Deo oriri potuisset exponere a Philosophia alienissimum est.”
The phrase “per Ministros” could refer to agents, servants, attendants, mediators,
and so on. By “conspiring” motions Newton means here (as in Law II in the Prin-
cipia) motions in the same sense or direction. The translation of vicissitudo is given
as “interchange,” rather than “conversion” or “transmutation,” since Newton used
conversio and transmutatio for the latter.
42. The topic of Newton’s theological or religious views in relation to his concept
of the “System of the World” is vast, and in this sense the word “complete” as I have
used it may be misleading. But I have given in this article a “complete” account of
the subject as it appears in the actual texts of the several printed editions of the
Principia.
43. The question of the different editions and translations is a very complex one,
and I shall not discuss it here. It is discussed in a separate publication dealing with
the relation of the various manuscript copies to one another and to the printed
editions.
44. Quoted from Cajori’s revised version, p. 566. We do not know who translated
this work.

APPENDIX I: NEW LIGHT ON THE FORM OF
DEFINITIONS I-II & VI-VIII

All but three of the Definitions at the beginning of the Principia contain
the phrase “est mensura ejusdem” or “est ipsius mensura,” translated
by Motte, and repeated by Cajori, as “is the measure of the same.”
Thus:

[Def. I.] The quantity of matter is the measure of the same, . . .
[Def. II.] The quantity of motion is the measure of the same, . . .
[Def. VI.] The absolute quantity of a centripetal force is the measure of
the same, . . .
[Def. VII.] The accelerative quantity of a centripetal force is the measure
of the same, . . .
[Def. VIII.] The motive quantity of a centripetal force is the measure of
the same, . . .1

I believe one will look in vain in the literature concerning the Principia
for an explanation of this odd phrase.
With regard to centripetal force, the presentation I have given of the
Scholium to the Definitions helps us at once to see Newton’s intention. The
clue is given by the general distinction we have seen Newton make in this
Scholium between “the quantities themselves” and “sensible measures of
them.” With respect to “the quantity” of centripetal force, Newton proposes
to us a set of three different “sensible measures.” We must not—to quote
and paraphrase Newton again—confuse such a “real quantity itself” as
centripetal force and its “vulgar measures.” Each of the three “sensible”
measures of centripetal force has its own purpose. That “sensible measure”
which is (Def. VIII) “proportional to the motion [i.e., momentum] which it
generates in a given time” is the one familiar today, in the common form
of the Second Law of Motion ($F = k \cdot mA$ or $F = k \cdot m\,\Delta V/\Delta t = km \cdot \Delta V/\Delta t$).
In the Principia, however, Newton most frequently uses that
“sensible measure” of centripetal force which is (Def. VII) “proportional
to the velocity which it generates in a given time,” i.e., the “accelerative”
measure, or “accelerative force” ($f = k \cdot \Delta V/\Delta t$); thus he determines the
acceleration itself $f$ (to use our present terminology), without introducing
the mass or “quantity of matter.” An example he gives (and which is dis¬
cussed following) is “accelerative gravity,” or the “measure” of “gravity”
(which is a “centripetal force”) “proportional to the velocity which it
generates [in all bodies] in a given time,” or “g.”
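In modern notation, and only as an illustrative sketch rather than anything Newton himself wrote, the relation between the motive and accelerative measures described above may be set out as follows. The symbols $m$ (quantity of matter), $\Delta V$ (change of velocity), $\Delta t$ (the given time), and the constant $k$ follow the formulas quoted in the text; the symbol $W$ for weight is an assumption introduced here for illustration.

\[
\begin{aligned}
F &= k\, m\, \frac{\Delta V}{\Delta t} && \text{(motive measure, Def. VIII)} \\
f &= k\, \frac{\Delta V}{\Delta t} && \text{(accelerative measure, Def. VII)} \\
F &= m f && \text{(dividing the first relation by the second)}
\end{aligned}
\]

On this reading the accelerative measure can be found from observed motions alone, and the motive measure follows once the quantity of matter is known. For gravity at a single place, where $f = g$ for all bodies, the last line becomes $W = m g$.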
In the case of “quantity of matter,” Newton says explicitly: “It is this
quantity that I mean hereafter everywhere under the name of body or mass.”
This quantity is known to us, furthermore, “by the weight of each body,
for it is proportional to the weight, as I have found by experiments on
pendulums, very accurately made, which shall be shown hereafter.” Thus
ends the amplification of Definition I. Newton here presents “mass”
(“massa”) or “quantity of matter” (“quantitas materiae”) as a “sensible
measure” of matter that “arises from its density and bulk conjunctly.” He
thus neatly and deftly avoids any discussion of what matter itself may be
in an absolute sense, or of what its “essential” measure might possibly be.
The “mass” he defines is “measurable” by the volume and density of the
body in question.2
I believe that one of the most revolutionary aspects of the Principia is
the way in which Newton uses “quantitas materiae” as the “measure” of
matter and distinguishes this “measure” from weight, which is an accidental
property of any given sample of matter.3 Thus, in the discussion of Definition
VIII, Newton explains that the “accelerative gravity” at any one place
near the surface of the Earth is “in all bodies . . . the same,” so that “the
motive gravity or the weight is as the body”; but if we were to ascend into
“higher regions of space,” we would find the weight and the corresponding
“accelerative gravity” both diminished in the same proportion, although
there would be no change whatever in the “quantitas materiae” or “corpus.”
Even on Earth, as we move from one latitude to another, the “accelerative
gravity” and hence the weight of a body will be found to vary, even though
its “quantitas materiae” does not. Thus could Newton explain the change
in the period of a seconds pendulum observed by Halley when he went from
London to St. Helena and by Richer when he went from Paris to Cayenne.
Newton assumed that it was the “accelerative gravity” (and hence the
weight) that changes from place to place on Earth and not the mass or
“quantity of matter,” because the variation in distance from the axis of
rotation would affect the “accelerative gravity,”4 while there was no
scientific ground for supposing that this same factor could possibly alter the
“quantity of matter” in the pendulum bob or in any other object.
In other words, Newton in the Principia saw clearly that it is folly in
science to deal with “absolute quantities”5 and that scientific discourse
must be based on “measurable quantities.” “Matter” has at least two
“measurable quantities”: mass and weight. One (mass) is an invariant
measure,6 while the other (weight) is an accidental or local measure and
hence varies according to the local conditions.
No one prior to Newton7 had had any true insight into the significance
of the Galilean presentation that all bodies at any one given place will, in
the absence of resistance, fall freely with the same acceleration “g” (or, to
use Newton’s terminology, will have the same “accelerative gravity”). The
reason is that Newton appreciated that Galileo’s result implies that weight
as a “motive force” is always proportional to “quantity of matter” for all
bodies at any one place. But this interpretation of Galileo’s experiments
and conclusions depends on two factors (both exclusively Newton’s): the
Second Law of Motion and the equivalence of two “sensible measures” of
matter. The “quantity of matter” presented in Definition I, as “its measure,”
proves in Definition III to be also the measure of a body’s resistance to
being accelerated or to undergoing a change in state (“whether of resting
or of moving uniformly forward in a right line”). But “quantity of matter”
is also the factor in a body (Prop. VI, Book III) that determines its
“weight” toward any given planet, since “all bodies gravitate towards
every Planet; and ... the weights of bodies towards any [the same] Planet,
at equal distances from the centre of the Planet, are proportional to the
quantities of matter they contain.” Because there are two measures of
matter—mass and weight, or, resistance to acceleration (or change of
state) and power to call forth a weight-force—Newton quite naturally had
to find the relation between them. In fact, at any one and the same place
(as Newton shows in Prop. VI, Book III, and states in the discussion of
Def. I), the “quantity of matter” is always proportional to the gravitational
“weight.”
The meaning of Galileo’s experiments thus becomes merely the equiv¬
alence of the two concepts of matter, or the constant proportionality of the
measures of matter: mass and weight.8 And at once it follows that any
change in the acceleration of free fall (or in “accelerative gravity”) must
produce a corresponding change in weight in all bodies (or all masses),
whether that change arises from a move from one part of the Earth to
another,9 or from one part of space to another.10
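The consequence drawn in this paragraph can be illustrated by a short worked relation in modern symbols. This is an admittedly anachronistic sketch: $W$, $m$, and $g$ are labels assumed here for weight, quantity of matter, and accelerative gravity, and any constant of proportionality is absorbed into the units.

\[
\begin{aligned}
W &= m\, g \\
\frac{W_1}{W_2} &= \frac{m_1}{m_2} && \text{(two bodies at the same place, hence the same } g\text{)} \\
\frac{W'}{W} &= \frac{g'}{g} && \text{(one body, unchanged } m\text{, carried to a place where accelerative gravity is } g'\text{)}
\end{aligned}
\]

The second line expresses the constant proportionality of mass and weight at any one place; the third expresses the change of weight with place while the quantity of matter remains the same.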
In Book II of the Principia, Newton proves (Cor. 1 to Prop. XXIV) that
if two pendulums have their “centres of oscillation . . . equally distant from
the centre of suspension,” it must follow from the equality of periods in
any pendulums that “the quantities of matter in each of the [pendulous]
bodies are as the weights.” On this basis, he could report (Prop. VI, Book
III) that quantities of matter (mass) and weight were found to be propor¬
tional in the following substances: gold, silver, lead, glass, sand, common
salt, wood, water, and wheat.11 He concluded, by induction, that this result
may be applied to all varieties of matter, and that this result would have
been the same no matter where the tests had been made. These experi¬
ments, Newton pointed out explicitly, were only intended to prove what
had been, “for a long time, observed by others,” namely, the descent in
free fall of “all sorts of heavy bodies” from equal heights “in equal times.”
As Newton puts the matter, “. . . and that equality of times we may dis¬
tinguish to a great accuracy, by the help of pendulums.”
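The logic of the pendulum test can be sketched in modern terms. What follows is not Newton's own geometrical argument in Book II but an idealized reconstruction, assuming a simple pendulum of length $L$ making small oscillations in the absence of resistance; $m$ denotes the bob's resistance to acceleration, $W$ its weight, $\theta$ its angular displacement, and $T$ its period, all of which are labels introduced here for illustration.

\[
\begin{aligned}
m L\, \ddot{\theta} &= -W \theta && \text{(small oscillations)} \\
T &= 2\pi \sqrt{\frac{m L}{W}} \\
T_1 = T_2,\; L_1 = L_2 \;&\Longrightarrow\; \frac{m_1}{W_1} = \frac{m_2}{W_2}
\end{aligned}
\]

Under these assumptions, bobs of different substances hung at equal lengths keep equal periods only if their quantities of matter are as their weights, which is the sense of the corollary quoted above.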
It is often said that what Newton proved by his pendulum experiments
is that mass is proportional to weight. Such a statement is grossly mislead¬
ing. Newton only devised a more accurate form of experiment to show how
exact Galileo’s result was, that all bodies fall freely with the same accelera¬
tion at any one place. His interpretation was not to hold that “mass” is a
universal entity or property of matter; rather, he showed that one particular
concept of “quantity of matter” he chose as a “sensible measure”
(proportional to the resistance to being accelerated) also is proportional to
another measure or “quantity of matter” which determines the gravitational
attraction or “weight towards any planet.”12 Without such concepts of
sensible measure, it would not have been possible to have a related law
of universal gravitation (or, law of gravity) and Second Law of Motion,
and thus to explain both the constancy of the acceleration of free fall for
all bodies at one place and the variation of this acceleration from place
to place.13

NOTES
1. Defs. I-II assert that the “measure” in question “arises from” the “density
and bulk conjunctly” (Def. I) or the “velocity and quantity of matter conjunctly”
(Def. II). But Defs. VII-VIII present the “measure” as “proportional to” either
“the velocity” (Def. VII) or “the [quantity of] motion” (Def. VIII) which is gen-
erated “in a given time.” Def. VI is somewhat different in style, for here the “meas¬
ure” is “proportional to the efficacy of the cause that propagates it from the centre,
through the spaces round about.”
2. Many authors have criticized Newton for introducing “circular” definitions;
for, if mass be measured by volume and density, is not density merely mass per
unit volume? The answer is: not necessarily. In Newton’s day “density” and “specific
gravity” were used interchangeably, and the specific gravity of many objects (or of
samples of objects) could be determined directly by simple experimental devices
such as for weighing the object in air and in water.
An example of the usage of “specific gravity” in Newton’s day occurs in Thomas
Salusbury’s translation of Galileo’s Discourse . . . concerning the natation of bodies
upon, or submersion in, the water (London: printed by William Leybourn, 1663;
reprinted with intro. and notes by Stillman Drake, Urbana: Univ. of Illinois Press,
1960). Lemma I reads: “The absolute Gravities of Solids, have a proportion com¬
pounded of the proportions of their specificall Gravities and of their Masses.”
Here (see p. xxv), the word “mass” is Salusbury’s translation of Galileo’s “mole”
and means volume or bulk.
In Newton’s Opticks, Book Two, Part III, Prop. X, he writes of “the Densities
of Bodies estimated by their Specifick Gravities.” In the accompanying table of
refraction in various substances, Newton describes the third column as: “The
Density and specifick gravity of the Body.”
3. Momentum, “quantitas motus,” is therefore based on mass (“quantitas
materiae”) rather than weight, or bulk (“moles”). And inertia (the ability to
resist any change in “state”—whether of motion or of rest) is also proportional to
“body” (“corpus”), which is another “name” for “quantity of matter” or “mass,”
according to Def. III.
4. Newton attributed this effect to two causes. One arises from the shape of
the Earth, an oblate spheroid, while the other is a pure effect of latitude in relation
to distance from the axis of rotation.
5. The Scholium following the Definitions is devoted to an exposition of the
grounds for basing physics on “sensible measures” of time, space, and motion
rather than on their “absolute” aspects, which are not detectable (save for rotation).
6. I mean here, of course, “invariant” in Newtonian terms, not Einsteinian.
7. Of all Newton’s contemporaries and predecessors, only Huygens possibly had
any real insight into the concept of mass.
8. One is tempted to state, however anachronistically, that Newton proved by
experiment the “equivalence” of “gravitational mass” and “inertial mass.” I believe
there can be no doubt that Newton was aware that the “quantity of matter” as a
“measure” of matter that determines (or “is proportional to”) its resistance to
being accelerated is not by definition alone equivalent to that “measure” of matter
that determines its “weight” or the force acting on it in a given gravitational field;
he obviously had the insight that this equivalence is shown by Galileo’s experi¬
ments, or by his own pendulum experiments.
9. In other words, Newton—to use our modern terminology—appreciated both
the constancy of the acceleration of free fall (same value for all bodies at any
one place in the universe: Galileo’s result), and its variation (unknown to Galileo,
and revealed by Halley’s and Richer’s expeditions).
10. It is most important to recognize that Newton dealt with the variation in
weight (and in the ever-proportional “accelerative gravity” or acceleration of free
fall) occurring as one would move out in space from the Earth’s surface. See the
discussion of Def. VIII. To use (anachronistically) our modern algebraic equations,
he saw that W ∝ g or that W = k · mg, and that “g” varies from place to place.
11. “And therefore the quantity of matter in the gold (by cor. 1. and 6. prop.
24. book 2.) was to the quantity of matter in the wood, as the action of the motive
force (or vis motrix) upon all the gold, to the action of the same upon all the wood;
that is, as the weight of the one to the weight of the other. And the like happened
in the other bodies. By these experiments, in bodies of the same weight, I could
manifestly have discovered a difference of matter less than the thousandth part
of the whole, had any such been.” Prop. VI, Book III.
12. It is most significant that in stating Prop. VI, Book III, Newton writes of
“the Weights of bodies towards any Planet” (“pondera eorum in eundem quemvis
Planetam”), since this usage implies that the force attracting bodies toward any
planet (or toward the Sun) is the same as the terrestrial force of gravity or weight.
In a Scholium just prior to Prop. VI, Newton says: “The force which retains the
celestial bodies in their orbits, has been hitherto called centripetal force. But it
being now made plain, that it can be no other than a gravitating force, we shall
hereafter call it gravity.” Newton had already shown in Prop. IV that the force by
which the Moon “is continually drawn off from a rectilinear motion, and retained
in its orbit,” is “the force of gravity” (i.e., terrestrial gravity) so that “the Moon
gravitates towards the Earth.” Hence, in this Scholium, Newton can use the name
“gravity” in relation to “the cause of that centripetal force, which retains the Moon
in its orbit” and he can conclude that this “cause . . . will extend it self to all the
Planets by rule 1. 2. and 4.”
13. I have developed these ideas further in my article: “Newton’s Second Law
and the concept of force in the Principia,” The Texas Quarterly, vol. 10 (1967), pp.
127-159. A corrected and expanded version of this paper will appear in my Newton’s
Natural Philosophy: Inquiries into Newton’s Scientific Work and Its General Environ¬
ment (in process).

APPENDIX II: AN UNPUBLISHED STATEMENT OF NEWTON’S
ON HIS SYSTEM OF THE WORLD

Among Newton’s manuscripts, there is a draft entitled “An Account of
the Systeme of the World described in Mʳ Newton’s Mathematicall Principles
of Philosophy,” which deals directly with the problems I have been
discussing.1 The handwriting indicates a date of composition in the very
early 1690’s, perhaps around 1692, and thus within some three to five
or more years after the publication of the Newtonian “Principles of
Philosophy,” or the Principia (1687): Mathematical Principles of Natural
Philosophy. On more than one occasion, Newton wrote about himself in
the third person, notably in the controversy with Leibniz.2 It may be ob¬
served that Newton uses an abbreviated form of the title of his great treatise,
the “Mathematicall Principles of Philosophy”; often he writes of his (or
“Newton’s”) “Principles of Philosophy” or even just “Principles.”3
Since the document printed herewith is a fragment, isolated from the
remainder of Newton’s manuscripts, we are given no key as to his purpose
in writing it. Possibly it was a statement prepared at the time of his
correspondence with Richard Bentley in 1692/3 in relation to the latter’s
sermons or Boyle Lecture on A Confutation of Atheism from the Origin and
Frame of the World.4 For on this occasion, Newton organized and expounded
his views on the necessity of the existence of God, as revealed by
the celestial dynamics of the Principia.
In the document printed below, Newton begins with the bold declaration
that “the main Question” in “determining the true systeme of the world” is
whether the Earth remains at rest or is in motion, that is, whether the uni¬
verse is geostatic or heliostatic.5 Of primary interest is Newton’s immediate
declaration that the Scriptures have been wrongly introduced into the
discussions of this question, since the Scriptures were not composed
“in the language of Astronomers,” but of the “common people to whom
they were written,” which is the same opinion that we have seen Newton
express at the Scholium on time and place, following the Definitions in the
Principia. Less convincing is Newton’s own interpretation of the passages he
cites: that they were intended “to tell the vulgar in their own dialect,” that
God had so firmly made fast on its foundations “the great continent of
Asia Europe & Africa,” that it could not move around on the Earth—in the
manner nowadays being studied in the light of Wegener’s continental-drift
theory. Newton concludes that the references in Scripture to “immoveableness”
do not, therefore, apply to the whole globe as a possibly moveable
entity, but only to “its parts one amongst another.”
In paragraph II, Newton marshalls the principles of the new physics to
show why there can be no simple experimental proof that the Earth does
or does not move—so long as the motion is “eaven” and not a series of
uneven “joggs.” Here Newton attacks the arguments of “common people”
and “mathematicians” who don’t really understand “the principles of
Mechanicks.” Originally he had referred to such writers as “mathematicians
who have skill enough only to write Collections.” But who it was that
Newton had in mind I do not know. Most likely it was Robert Hooke,6
and it was in fact with Hooke that Newton had had a most interesting and
significant exchange of letters on the effects of the Earth’s motion. The
occasion had been in 1679, when Hooke had initiated a “philosophical”
correspondence with Newton. Hooke wanted to discuss the motion of
planets, but Newton had attempted to shift the subject by introducing “a
fansy of my own about discovering the earth’s diurnal motion.”7 At issue
was the actual path of descent of an object let fall from a tower on a
moving Earth; if the object could penetrate to the Earth’s center, its path
in absolute space would be an ellipse, but with respect to the Earth it would
be a curve of a special sort which Newton drew.8 From this correspondence
Newton was led to consider the paths of bodies under various centripetal
forces and, in particular, the elliptical paths of planets in accord with
Kepler’s laws under the action of an inverse-square law of force.
In the Principia, Newton had shown that experiment would never enable
one to distinguish between the dynamically equivalent states of uniform
rectilinear motion and of rest; thus one could never know by experiment
whether the solar system as a whole were at rest or in uniform rectilinear
motion. But it is different for rotation. For instance, there are a number of
definitely observable effects of the Earth’s rotation, among them the
variation in the period of a pendulum from one latitude to another, the
shape (oblate spheroid) of the Earth, and the precession that depends on
the rotation of the Earth in relation to the inclination of the axis and to
the Earth’s shape. Clearly, then, the declaration at the end of paragraph II
cannot refer to the Earth’s rotation. When Newton says that terrestrial
experiment alone can never determine whether the Earth is at rest or in
motion, he must have had in mind the orbital or annual motion and not the
daily rotation. Since the uniform translation of the solar system as a whole
was never a major theological or philosophic issue, Newton must have been
referring to the orbital motion of the Earth. No doubt he had in mind the
revolution of the Earth in its orbit; the observed effects are grossly the same
for a fixed terrestrial observer on an Earth at rest and for that same fixed
observer on an Earth either in linear uniform motion or in its annual orbital
motion.9
The following manuscript, here printed for the first time, ends in the
middle of a sentence. A portion of each page has been destroyed and the
editorial restoration of missing words or parts of words is presented within
angle or “claw-hammer” brackets. I have used “corners” ⌊ ⌋ to indicate
insertions made by Newton in his own text, and I have indicated deletions
and other changes within square brackets (my editorial comment, as “del.”
for deleted, being in italics). The only change I have made is to substitute
curved braces { } for Newton’s own square brackets, which he tended to
use in at least three ways: (1) as we would use parentheses or brackets,
say within quotations, (2) to show quotations, and (3) to mark off passages
for deletion.

An Account of the Systeme of the World described in Mʳ Newton’s
Mathematicall Principles of Philosophy.

I. Scripture abused to prove the immoveableness of the [earth del.] globe
of yᵉ [Eart del.] Earth.

Marginal references: ᵃ Psal 93.2 & 96.10. [ᵇ Psalm 98.8. del.] ᵇ Strabo
Geog. l. 1. p. 2, 4. ᶜ Prov. 8.27. Job. . . . 8. ᵈ Job. 38.18. Psal. 50.1.
ᵉ Job. 28.24 & 37.3. Psal. 46.9. & 72.8. ᶠ Psal 74.17 [ᵍ Prov. 8.27 del.]

In determining the true systeme of the world the main Question is whether
the earth do rest or be moved. For deciding this some bring texts of
scripture, but in my opinion misinterpreted, the Scriptures speaking not in
the language of Astronomers (as they think) but in that of yᵉ common people
to whom they were written. So where tis said thatᵃ God hath made yᵉ round
world so fast that it cannot be moved, the Prophet intended not to teach
Mathematicians the spherical figure [of the whole del.] & immoveableness of
the whole earth & sea in the heavens but to tell the vulgar in their own
dialect that God had made the great continent of Asia Europe & Africa so
fast upon its foundations in the great Ocean that it cannot be moved therein
after the manner of a flo⌊a⌋ting Island. For this Continent was the whole
habitable world anciently known & by yᵉ ancient eastern nations was
accountedᵇ round or circular as was also theᶜ sea encompassing it. ⌊& this
earth & sea they accounted flat as if yᵉ sun moon & stars ascended out of yᵉ
ocean at their rising & went down into it again at their setting⌋ This
Continent is the world or earth usually mentioned in scripture & there
described to beᵈ broad & to haveᵉ ends orᶠ borders, [& del.] ⌊that is
[ᵍ del.] circular ones⌋ whose center some placed in Egypt others at Delphos,
others at Jerusalem. And this world the Prophets consider as established in
the Ocean upon sure & immoveable foundations at yᵉ first creation. The
heavens were of old & the earth standing out of yᵉ water & in the water
{that is in the midst of the Ocean like an Island} by the word of God.
2 Pet. 3.5. Thou Lord in the beginning hast laid the foundations of the
earth & the heavens are the work of thine hands Psal 102.25. Prov. 8.29.
Where wast thou when I laid the foundations of the earth. Declare if thou
hast understanding who hath laid the measures thereof or who hath stretched
<the> line over it. Whereupon [from or whereupon] are the foundations
thereof <fix’d> or who hath laid the corner stone thereof, when the starrs
<of the> morning praised me together, &c. Job 38.4. [⌊When he set a circle
upon the face of the deep {that is formed it circular about the earth}—when
he appointed the foundations of the earth, then was I by him. Prov. 8. 27,
29⌋ del.] The earth is the Lord’s & all that therein is the compas of the
world & they that dwell therein. For he hath founded it upon the seas &
established it upon the floods Psal 24. 1, 2 & 136.6. ⌊Thou hast laid the
foundation of the round world Psal. 89.12. When he set a circle upon the
face of the deep {that is, formed it circular about the earth}—when he gave
to the sea his decree that yᵉ waters should not pass his commandmᵗ, when he
appointed the foundations of the earth, then was I by him. Prov. 8.27, 29⌋
He laid the foundations of the earth that it never should move at any time:
Thou encompassedst it wᵗʰ the deep like as with a garment Psal. 104.5. So
then the round world spoken of in scriptures is such a world as hath
foundations ⌊& is founded in the waters⌋ & by consequence ’tis not the whole
globe of the Earth & Sea but only the habitable dry land, ffor the whole
Globe hath no foundations, but this ⌊habitable⌋ world is founded in the
seas. And since this world by reason of the firmness of its foundations is said
in scripture to be immoveable this immoveableness cannot be of yᵉ whole
globe together, but only of its parts one amongst another & [originally it]
signifies nothing more than that those parts are firmly compacted together
so that the dry land or Continent of Europe Asia & Africk cannot be moved
upon the main body of yᵉ globe on wᶜʰ tis founded, ffor this immoveableness
of yᵉ earth is opposite to that it’s motion spoken of in Job. He removeth
the mountains & they feel not when he overthroweth them in his wrath: He
removeth the earth out of her place that the pillars thereof do shake
Job. 9.6.

II. Mathematicks abused to prove the Globe of the Earth immoveable

There is another sort of arguments against the motion of yᵉ whole earth
taken from oʳ senses, as if the earth could not be moved wᵗʰout oʳ being
many ways sensible of its motion. But this way of arguing proceeds from want
of skill & judgment in Mathematical things, & therefore is insisted upon
only by the common people & such [originally some] [practical del.]
mathematicians ⌊as understand not so much as the principles of Mechanicks⌋
[who have skill enough only to write Collections, del.] Were the earth moved
uneavenly by joggs such motion would be easily perceived, but an eaven
motion such as the earth’s is supposed, ought to be imperceptible, ffor in
any systeme of bodies the motions of yᵉ bodies one amongst <anoth>er are the
same whether the systeme rest or be <move>d on uniformly, as is
mathematically demonstrable. So the motions of all things in a ship are
found the same whether the ship rest or be under sail. In both cases things
fall perpendicularly down by the mast & projectiles fly alike towards all
quarters. Nor can a blinded Marriner tell whether the ship move fast or slow
or not at all. And there is the same reason of the System of the earth sea &
air with the things therein. We cannot tell by oʳ senses whether they all
rest or move on eavenly together.

III. Accurate skill in Geometry & Mechanicks requisite to decide the Question.

Such arguments as these being insufficient to determin the Question, ’tis
fit we should lay aside these & the like vulgar prejudices & have recourse
to some strickt & proper way of reasoning. Now the Question being about
motion is a mathematical one & therefore requires skill in Mathematicks to
decide it. And seeing it is difficulter to argue demonstratively about
magnitude & motion together then about magnitude alone, there is greater
skill required here then in pure Geometry so that none but able
Mathematicians may pretend to be competent judges of this matter. The great
difficulty of this part of Mathematicks seems to be the reason that yᵉ
Ancients made but little progress in it. In this last age since the revival
& [from of] advancement of these studies, some able Mathematicians as
Galileo & Hugenius have carried it on further then yᵉ Ancients did. Mʳ
Newton to advance it far enough for his purpose has spent the two first of
his three books in demonstrating new Propositions about force & motion
before he begins to consider the systeme of the world. Then in his third
Book he teaches that systeme from the Propositions demonstrated in the two
first. The designe of this <pape>r is to give you an account of this Systeme
<& re>ferr you for the Demonstrations thereof to the <Book its>elf or to the
judgment of such Mathematicians as have perused it [The final three words
are at the top of a new page; there is no end punctuation.]

NOTES
1. Portsmouth Collection: University Library, Cambridge, MS Add. 4005, Sec.
7, fols. 39-42. This MS is written entirely in Isaac Newton’s hand.
2. Newton not only wrote out in his own hand the drafts and final text of the
“impartial report” of the Royal Society’s committee to investigate the possible claim
of Leibniz to have been a “first” (or independent) discoverer of the calculus (and
hence innocent of any charge of plagiarism); he also wrote a review of the report itself
(the Commercium Epistolicum), which he published in English in the Philosophical
Transactions and then printed in a Latin version as an introduction to the next
edition of the Commercium. The authorship of Newton was revealed only in the
nineteenth century; see Augustus De Morgan: “On the authorship of the account of
the Commercium Epistolicum, published in the Philosophical Transactions,” Phil¬
osophical Magazine, vol. 3 (1852), pp. 440-444. Newton’s article was entitled: “An
account of the book entitled Commercium epistolicum Collinii & aliorum, De analysi
promota, published by order of the Royal Society, in relation to the dispute between
Mr. Leibnitz and Dr. Keill, about the right of invention of the new geometry of
fluxions, otherwise call’d the differential method,” Phil. Trans., vol. 29 (1715), pp.
173-224. The MS drafts of the Commercium and the review, both in Newton’s
hand, may be found in U.L.C. MS Add. 3968.
3. “Principles of Philosophy” recalls to mind at once the title of Descartes’
treatise (Principia philosophiae), which Newton’s superseded. On these titles, see
my article: “Newton in the light of recent scholarship,” Isis, vol. 51 (1960), pp.
489-514.
4. Newton’s correspondence with Bentley may be found in vol. III of the Royal
Society’s edition of the Correspondence of Isaac Newton. The first printing of these
letters is reproduced in facsimile, with a facsimile of Bentley’s sermons on the
“frame of the world,” together with a commentary by Perry Miller, in I. B. Cohen
& R. E. Schofield (eds.): Isaac Newton’s papers & letters on natural philosophy
(Cambridge: Harvard Univ. Press, 1958).
5. Newton is concerned only with the question of the rest or motion of the Earth,
not with the identification of the center of the “universe” or of the “world.”
6. Robert Hooke had, in fact, edited a set of “Collections” for the Royal Society
and was the author of a work entitled An attempt to prove the motion of the Earth
from observations (London: printed by T.R. for John Martyn, 1674). But Hooke—
at least in this work—sought for evidence in the possible annual parallax of the
fixed stars and not in the results of terrestrial experiments.
7. Newton’s letters to Hooke, and Hooke’s replies, may be found in the Royal
Society’s edition of The correspondence of Isaac Newton, vol. 2 (Cambridge: at the
University Press, 1960), esp. p. 301; see also Newton’s comment on this episode
in a letter to Halley of 20 June 1686, ibid., p. 438. For the significance of this
episode, see D. T. Whiteside: “Newton’s early thoughts on planetary motion: a fresh
look,” British Journal for the History of Science, vol. 2 (1964), pp. 117-137, esp. pp.
131 sqq.
8. For the actual curve, see J. A. Lohne: “The increasing corruption of Newton’s
diagrams,” History of Science (Cambridge), vol. 6 (1967: actually fall 1968), pp.
69-89, esp. § 3.
9. That is, for gross observations, such as the dropping of weights from towers
as seen by naked-eye observers, there are no visible effects of the Earth’s revolution
in its annual orbit because this “eaven” motion—for so short a time—is not per¬
ceptibly different from a uniform linear motion.
THE LAW OF INERTIA: Some Remarks on Its
Structure and Significance
Arnold Koslow

It is surprising that many of the discussions of the role and status of the
law of inertia conclude with remarks which hold for theoretical statements
in general. Thus, one writer1 has noted, correctly I think, that if the law
of inertia is an empirical proposition, then it is not an empirical generaliza¬
tion. “All simians are hairy,” is an empirical generalization; it is possible,
given our beliefs about the world, to support this proposition directly
with many and varied instances of hirsute simians. But Newton’s first law
of motion cannot rest upon this kind of direct instantial support because
such instances do not exist. Newton’s law states that if no forces are
present, acting upon a body, then the body preserves its velocity. Thus, if
the body is at rest, it remains at rest, and if it is in motion with a certain
velocity, then it continues to move with the same speed in a straight line.
Given our belief that every body attracts every other body with a force
that is inversely proportional to the square of the distance between them,
it seems to follow that any instance of bodies actually in motion cannot
be one in which no forces act. If Newton’s first law of motion is not an
inductive generalization, then it is on a par with typically theoretical state¬
ments of the physical sciences. Such a conclusion, though true, fails to
bring out the importance and the centrality which the law of inertia has
for theories of motion. Other writers have urged, for various reasons,2
that the law of inertia is not an empirical proposition simpliciter. Hanson,
for example, seems to suggest that there is no one law of inertia but a
“family of schemata.” Certain of its terms are “semantically linked” to
the others so that the meaning of its terms can be unpacked from the
remainder. According to Hanson, a physicist, by deciding what status to
give to the law of inertia, sets forth the logical structure of mechanics by
indicating which terms are primitive and which are derivative. Thus
Hanson maintains that “in some contexts the law may express a priori
relationships within mechanical theory, e.g., the semantical connections
between “uniform,” “rectilinear,” “ad infinitum,” “acceleration,” etc.
On other occasions, the law may reflect, a posteriori, a range of facts that
support mechanics as a set of empirical descriptions and pragmatically
useful calculational techniques.”3 I am unsure whether Hanson means
that the same law is sometimes a priori, sometimes a posteriori, or that
the same linguistic expression is used sometimes to reflect a priori relations
and sometimes to reflect empirical matters. The latter is less obscure, but
surely a less exciting thesis.4 It is also one which is true of many lawful
expressions other than that of the law of inertia.
In this paper I would like to suggest and explore some of the ways in
which the law of inertia has had a special role within physical theory.
In the course of the discussion, I will say something about the characteriza¬
tion of those laws which are laws of inertia, and suggest why discussions
which are limited to an evaluation of the law as empirical, a priori—or
worse, some contextually dependent hybrid—fall short of a just apprecia¬
tion of this special kind of physical law. I shall suggest how the peculiar
structure of laws of inertia could encourage these contrasting descriptions
of its logical status.

II

The law of inertia is either a part of a physical theory about the motion
of bodies, or it is supplementary to it. In the latter case, the law is not
deducible from the theory. For example, Descartes’ law of inertia seems
to be a supplement to his theory of motion. The latter consisted of a
series of laws stating which kinds of motion would follow certain kinds
of collision or impact between bodies. Descartes supplemented his theory
of motion with the law of inertia: If a body is at rest, it remains at rest;
if it is in motion, then it continues to move with a constant speed in a
straight line, unless it collides with other bodies.5 This law is significantly
different from Newton’s first law of motion because of its explicit stress
upon impact as the sole source of change from an inertial motion. Never¬
theless, granted that a body is either in collision or not, the law of inertia
proposed by Descartes serves as a natural complement to his theory of
motion, although it is not a deductive consequence of that theory.
In contrast, Newton’s law of inertia is of course an integral part of his
theory of motion since it is one of his three laws of motion. Consequently,
his law of inertia is deducible from his theory of motion.
Still, it is true that his law of inertia complements his second law of
motion which states that the total force acting upon a body is equal to
the product of the body’s mass and its acceleration. Occasionally some
writers have made the further point that Newton’s first law is a deductive
consequence of the second law of motion. This remark seems initially
plausible, and then, after a moment’s reflection, implausible.
In the Principia6 the three laws of motion are stated, each followed by
a small expository section, and all included in the space of two medium-sized
pages. The remainder of that section, called “Axioms or Laws of
Motion,” contains a deduction of eight corollaries and a large concluding
Scholium. No one needs to be reminded of the deductive rigor and
organization of the Principia. It therefore borders on the incredible that
Newton neglected to point out the deductive connection between his first
and second laws of motion, especially since the supposed deduction is so
simple. One deduction proceeds as follows: The total force acting upon a
body is the product of its mass and acceleration. If the total force is zero,
then the acceleration is zero, because the mass of a body is positive. There¬
fore the velocity of the body is constant. What are we to say about this
simple proof and the consequent redundancy of the Principia?
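In modern vector notation (not Newton’s own, and assumed here only for illustration), the supposed deduction can be sketched as follows, with F the total force on the body, m its mass, a its acceleration, and v its velocity:

```latex
\begin{align*}
\mathbf{F} &= m\,\mathbf{a} && \text{(second law, modern form)}\\
\mathbf{F} = \mathbf{0},\; m > 0 &\;\Longrightarrow\; \mathbf{a} = \mathbf{0}\\
\mathbf{a} = \frac{d\mathbf{v}}{dt} = \mathbf{0} &\;\Longrightarrow\; \mathbf{v} = \text{constant}
\end{align*}
```

Each step is elementary, which is why the absence of any such remark in the Principia calls for explanation.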
It is poor policy to suggest that Newton was a mathematical dunce.
A more credible alternative is that Newton understood his first and second
laws in a slightly different way than we do at present, so that his law of
inertia complemented his second law of motion but was not a deductive
consequence of it.
Since this point is not the major target of this paper, I would like
only to sketch one reason in support of my view. Newton believed that
forces which act upon a body are external causes whose effect is the
body’s motion. These forces arise somehow from the presence and motion
of other bodies, but the exact manner in which forces were produced was
sometimes unknown and therefore left unspecified. Forces are causes
which have true motions as their effects. On the other hand Newton also
believed that certain forces could be produced by motion with respect to
Absolute Space. For example, he thought that if a body rotated with
respect to Absolute Space, then this motion would give rise to or produce
a centrifugal force which in turn would produce certain distortions. The
entity Absolute Space was assigned a dual role: It was a frame of reference
for Newton’s theory of motion, and it also had a causal role within that
theory.7 Ernst Mach tried to find a replacement for Absolute Space which
would preserve both roles which Newton gave to it. Hertz, though, thought
that Newton had taken forces “twice over”—once as cause and once as
effect. And this revealed, he thought, the obscurity of Newtonian force
and motivated, in part, Hertz’s own efforts to reformulate the science of
mechanics.
Newton’s second law, I suggest, does not state without qualification that
the total force on a body equals mass times acceleration. This law has
sense only if there are forces or external causes acting upon a body. That
is, the second law should read as follows: If there are forces acting upon
a body, then the total of all these forces equals the product of the body’s
mass and acceleration. Under certain conditions, then, a quantitative
relation holds between a cause or number of causes and their effect.
What happens when there are no external causes acting upon a body?
Newton’s first law is supposed to cover this situation. It states that if there
are no forces acting upon a body, then it has zero acceleration. Once the
first and second laws are described this way, it becomes clear that the first
law complements the second but is not a deductive consequence of it. A
statement of what happens when causes are present does not entail a
statement describing what happens when those causes are absent.
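
The logical point can be put schematically. As a sketch only, let P stand for “forces act upon the body,” Q for “the total of these forces equals mass times acceleration,” and R for “the body has zero acceleration”; these letters are labels introduced here for illustration, not notation found in the Principia:

```latex
\begin{align*}
\text{Second law (as read above):}\quad & P \rightarrow Q\\
\text{First law:}\quad & \neg P \rightarrow R\\
\text{Countermodel:}\quad & P = \text{false},\ R = \text{false}
  \ \text{makes}\ P \rightarrow Q\ \text{true and}\ \neg P \rightarrow R\ \text{false}
\end{align*}
```

Because the conditional P → Q is true whenever P is false, it places no constraint on R in the force-free case, so the first law cannot be derived from it.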

III

There is an obvious difference in the specific content of the laws of
inertia of Descartes and Newton. Yet each, despite the difference, is a law
of inertia, and each statement complements its respective theory of motion.
I shall refer to their similitude by calling these statements inertialike, and
I shall refer to the second feature by saying that each statement is inertial,
given T, its theory of motion. It is through a study of the systematic inter¬
connection of these two concepts that the methodological distinctiveness of
the laws of inertia emerges.
We shall have more to say about the inner structure of inertialike statements
below. Postponing that, I shall say that a statement is inertial, or
inertial simpliciter, if and only if it is inertialike and there is some theory
such that the statement is inertial, given it. Thus, if a statement is inertial,
or inertial, given T, then it is inertialike. If it is inertial, given T, then it
is inertial. The converses are not necessarily true so that in order of
strength we have “inertial, given T,” “inertial,” and “inertialike.” Let us
try to say something more definite about inertialike statements.

IV

No one would deny, I think, that the inertialike statements usually
associated with Aristotle, Kepler, Galileo, Descartes, and Newton are
manifestly different in content. Yet these differences are not so important
if we wish to understand why these laws seem to be more similar to each
other than any of them is to a law about the thermal behavior of metals.
Each states that under certain conditions which we call normal, a body
undergoes its natural motion. The Cartesian law of inertia differs from the
Newtonian one over the condition of normalcy. According to Newton,
the normal condition is one in which no forces act upon a body; according
to Descartes, a situation is normal if and only if it is collision free. But both
seem agreed that the motion of a body is natural if and only if it is
velocity-preserving. The Galilean law of inertia, however, differs from both the
Cartesian and Newtonian ones in its condition of natural motion. Galileo
seems to have believed that bodies move naturally if they move on circular
paths around the center of the earth with constant speed. The
Aristotelian and Keplerian inertialike statements also have the form of
conditionals which specify certain normal conditions and those natural
motions which take place under them. Clearly many conditions have
been offered as normal, and an equal variety of conditions have been
considered the natural motion of bodies. We may well ask whether any
type of motion is excluded as unnatural or any type of circumstance is
nonnormal. Despite the variety, I think that there are some broad conditions
which have to be satisfied. For example, it seems to be true that if a
type of motion is natural, then all other motions are composed of motions
of that kind or resolvable into them. Uniform circular motion and uniform
rectilinear motion both meet this demand, which shows that the resolvability
condition is a necessary but not a sufficient condition for natural
motion.8
There is another condition which all natural types of motion satisfy:
They are—for want of a better term—explanatorily homogeneous or
uniform. If “M” denotes a type of natural motion referred to in a law of
inertia, then that law of inertia will be related to a theory of motion T, in
a special way, which we will discuss below. “M” is explanatorily uniform
if and only if it is not the disjunction of two conditions, “M*” and “M**,”
such that the theory T can explain why a natural motion is of the kind M*
rather than M**. All natural motions are therefore on a par with respect
to the theory’s explanatory resources. The familiar argument that classical
mechanics cannot distinguish between rest and uniform motion is an
argument for the homogeneity of the natural motion employed by Descartes
and Newton.
Parallel remarks also hold true for the condition of normalcy. It seems
necessary to any condition of normalcy that it describe a body in a situation
when it is uninfluenced by the action of forces. The Cartesian and
Newtonian conditions of normalcy obviously describe bodies when they
are free from such influences, and the Galilean condition also seems to
be a condition of this kind. For Galileo believed that changes in the
distance between a body and the center of gravitation were the cause of
the body’s change in velocity. It followed that if bodies were uninfluenced,
that is, if they kept a fixed distance from the center of gravitation, then
there would be no change in velocity. It was by an argument like this
one that Galileo concluded that the natural motion for bodies is a concentric
circle around the earth, assuming that the center of the earth and
the center of gravitation coincide.9 One condition seems necessary to all
conditions of normality: They must describe as normal the special case
of a body completely by itself. This restriction accords well with the
belief of all these writers that the forces and influences upon bodies arise
in some way from the presence and motion of other bodies. In the absence
of other bodies, such forces should also disappear. The requirement that
normal conditions always include the case of the isolated body is clearly
not sufficient to determine one and only one normal condition.
Questions of truth and even specific content aside, every one of these
laws of inertia specifies that whenever a body is in normal circumstances
it will execute a natural motion. They have a conditional form, “N ⊃ M,”
where “N” and “M” are predicates which specify conditions of normalcy
and natural motion respectively, and “N ⊃ M” states that for any entity
x, if x is in a normal circumstance (N), then x executes a natural motion
(M).
The story thus far is not the entire story about inertialike statements.
They have played a central role in the explanations offered by theories of
motion, and that role has so far not been described. Also, nothing we
have said thus far explains why inertialike statements have always received
a great amount of attention—more so than the other laws of motion, more
attention surely than the various laws of collision and the principle of
the equality of action and reaction. The great interest generally shown
in the laws of inertia over other laws of motion is, I think, no mere
accident.

V

Usually an inertialike statement is not just one law among others.
It is related in a complex way to the theory of which it is a part; it is,
we shall say, inertial, given that theory. If L is true and inertialike, then
whenever a body is under normal circumstances, its motion will be natural.
For L to be inertial, given the theory T, the theory must explain why a
deviant or nonnatural type of motion takes place. Further, the theory
must be able to explain specific features of the deviant motion, using
information about the specific way in which the situation differs from
normal. For example, Newton’s first law is inertial, given his theory of
motion. The fact that a body is accelerating can be explained by using his
theory together with the information that forces are acting upon the body.
Further, specific features of the accelerated motion can be explained given
the theory together with additional or more refined information about the
kind of forces which are active. More briefly, the theory T should explain
why not-M takes place, given the information that not-N. Consequently, if
L (“N ⊃ M”) is inertial, given the theory T, then the converse of L, “M ⊃
N,” is a deductive consequence of T.10 This shows that if L is inertial,
given T, then L complements T. For L tells us, if true, what happens when
N holds true. And T tells us what happens when N fails to hold, since it
yields “If not-N then not-M” as a deductive consequence.
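The step from explanation to complementation rests on contraposition. As a sketch, writing the conditional with the horseshoe as “N ⊃ M” and assuming, purely for illustration, first-order quantification over bodies x:

```latex
\begin{align*}
L &:\quad \forall x\,(Nx \supset Mx) \\
T \vdash \forall x\,(Mx \supset Nx)
  &\;\Longleftrightarrow\;
  T \vdash \forall x\,(\neg Nx \supset \neg Mx) \\
T \cup \{L\} &\vdash \forall x\,(Nx \equiv Mx)
\end{align*}
% Line 2 is contraposition applied to the converse of L.
% Line 3 conjoins L with that converse: given the theory and the law together,
% a body is in normal circumstances exactly when its motion is natural.
```

The law and the theory thus cover disjoint cases, which is the sense of "complements" intended here.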
A law of inertia L is a statement which is inertialike, inertial, given a
theory T, and true. The last condition is necessary. Were L false, it would
not be a law, and therefore not a law of inertia. Laws of inertia, truth
aside, are inertialike and inertial, given a theory of motion T. And during
the remainder of this essay I will speak of laws of inertia—their truth
aside.
There have been many changes in the law of inertia and we will not
review here the empirical and conceptual considerations which warranted
their acceptance or rejection. Consider though the difference between
Kepler’s law of inertia and that of Newton. Keplerian inertia states that
under normal circumstances bodies remain at rest, if at rest, while
Newtonian inertia requires the conservation of velocity under normal
conditions. Both statements are inertialike and have associated theories
of motion. Kepler’s theory, it will be recalled, was more a theory of
planetary than terrestrial motion. He believed that the planets moved
because certain radial emanations from the sun turned with the sun like
the spokes of a wheel and gave the planets tangential velocities. Each of the
planets also “experienced” an attraction toward the sun whose intensity
varied with the distance from it. Newton’s theory of motion is very different,
yet we think of Kepler’s law of inertia and Newton’s as closely related.
The differences are of course great. Both statements are inertialike, but
they differ in content and there is not even a formal similarity between
them. Both are related to theories of motion, but we have just noted how
different those theories are from each other. Despite these differences,
we think of the change from Kepler to Newton as a change in the law of
inertia rather than a change of the law of inertia. The source of the kinship
lies in the preservation of two basic properties, that of being inertialike
and that of being inertial, given a theory. Changes of a law of inertia
are far more radical. They take place if and only if L has been replaced
by a law which fails to be inertialike, or, if inertialike, which fails to be
inertial given the theory of motion.
I should warn the reader that the distinction between changes in and
changes of the law of inertia involves a comparison of the products of
scientific research. It is not a substitute for a description of the
empirical and conceptual background which made such changes necessary,
nor is it a surrogate for a description of the way in which such changes
were convincingly brought about.
These distinctions help mark off the major tendencies and movements
in the history of science. But they also teach us something of the complexity
of a term like “inertia” as it is used in the physical sciences. I think we
can understand how key terms like “inertia of a body,” “inertial motion,”
and “law of inertia” have an invariant sense despite the changing background
of laws and theories.

VI

I would like to suggest two ways in which the law of inertia has a
special importance for theories of motion. First, there is a sense in which
a theory of motion can have one and only one law which is inertial, given
it. Second, the explanatory power of one theory of motion with respect to
another is reflected in the scope of their respective laws of inertia.
The first point can be made clear by a number of theorems, the first
of which tells us something about the uniqueness of the consequent
condition of a law of inertia. The second theorem tells us something
similar about the antecedent condition.
Theorem 1. If “N ⊃ M” and “N ⊃ M*” are two true inertialike statements,
and both are inertial, given T, then “M” and “M*” must be coextensional,
assuming T. That is, the predicates “M” and “M*” apply and
fail to apply to exactly the same entities.
The proof is direct. Let β be a body. Suppose that “M” holds true of β.
Since “N ⊃ M” is inertial, given T, it follows that “M ⊃ N” is deducible
from T. Therefore, given T, β has N since it has M. But the conditional
“N ⊃ M*” is true, so that β has M*. The converse is also true. Suppose β
has M*. Since “N ⊃ M*” is inertial, given T, the conditional “M* ⊃ N”
follows from T. Therefore, assuming T, β has N. But “N ⊃ M” is true, so
β also has M. Consequently “M” and “M*” are coextensive.
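The proof of Theorem 1 can be laid out step by step. This is a sketch only: the horseshoe is used for the conditional, β names the arbitrary body of the proof, and the line numbering is an addition for reference.

```latex
\begin{align*}
&(1)\quad \forall x\,(Nx \supset Mx)               && \text{``}N \supset M\text{'' is true} \\
&(2)\quad \forall x\,(Nx \supset M^{*}x)           && \text{``}N \supset M^{*}\text{'' is true} \\
&(3)\quad T \vdash \forall x\,(Mx \supset Nx)      && \text{``}N \supset M\text{'' is inertial, given } T \\
&(4)\quad T \vdash \forall x\,(M^{*}x \supset Nx)  && \text{``}N \supset M^{*}\text{'' is inertial, given } T \\
&(5)\quad M\beta \supset N\beta \supset M^{*}\beta && \text{from (3), (2), assuming } T \\
&(6)\quad M^{*}\beta \supset N\beta \supset M\beta && \text{from (4), (1), assuming } T \\
&(7)\quad \forall x\,(Mx \equiv M^{*}x)            && \text{from (5), (6), since } \beta \text{ was arbitrary}
\end{align*}
```

Each direction passes through the shared normal condition “N”, which is why the two consequents cannot come apart once T is assumed.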
Theorem 2. If “N ⊃ M” and “N* ⊃ M” are two true inertialike statements,
both inertial, given a theory T, then, assuming T, “N” and “N*”
must be coextensive. The proof is similar to the one used for Theorem 1.
Both theorems can be obtained from a third:
Theorem 3. If “N ⊃ M” is a true inertialike statement, which is inertial,
given T, then assuming T, “N” is a maximal predicate with respect to the
predicate “M”. That is, if “N* ⊃ M” is another true inertialike statement,
which is inertial given T, then every N* is an N. It is also true that the predicate
“M” is maximal with respect to the predicate “N”. The proof is similar
to those already used in the previous theorems. One fact ought to be
noticed. Despite these theorems, it still seems possible that there can be
more than one law of inertia, even with respect to the same theory T. Thus,
it might still be true that there are two true inertialike statements, “N ⊃ M”
and “N* ⊃ M*,” both of which are inertial, given the theory T.
This possibility seems a strange one, and it seems to be counter to our
understanding of the role which laws of inertia play with respect to their
theories of motion. I do not know of any case in the history of science where
two proponents of the same theory of motion defended different laws of
inertia. Further, a theory of motion complements a law of inertia by explaining
those cases not covered by the law of inertia. It therefore seems
strange to think of those uncovered cases as divided into two distinct
groups. This is precisely what would happen if there were two true
inertialike statements, both inertial, given the theory T. And what reason
could there be for this subdivision of the unexplained cases of motion?
If there is an agreement by physicists over the specific predicates to be
used either for normalcy or natural motion, then the preceding theorems
imply that there is only one law of inertia, given T. However, even without
such an accord among physicists, it follows from some plausible assumptions
that there is only one law of inertia, given T. This result is best summarized
by three small theorems.
Suppose that there were two true inertialike statements, “N ⊃ M” and
“N* ⊃ M*,” which are inertial, given T. Suppose further that every N
is an N*, or conversely, that every N* is an N. In the former case, since
every N* is an M*, it follows that every N is an M*. But it is also true that
every N is an M. By our previous theorems it follows that “M” and “M*”
must be coextensive predicates. (Remember that “M” and “M*” are both
maximal predicates given N.) Since every N* is an M*, it now follows that
every N* is M. But it is also true that every N is M. Again, by our previous
theorems, “N” and “N*” must be coextensive. The corresponding normal
conditions and natural conditions are coextensive. The same conclusion
follows from the second case, when every N* is N. Thus we have proved
Theorem 4. Let “N ⊃ M” and “N* ⊃ M*” be two true inertialike
statements, both inertial, given theory T. If either every N is N*, or every
N* is N, it follows that “N” and “N*” are coextensive predicates, and so
too are “M” and “M*”.
In point of truth value, there is nothing to choose between the two inertialike
statements. But the condition whereby at least one of the two conditions
of normalcy includes the other is not an easy condition to verify, and
there are other assumptions which are more plausible and lead to the same
conclusion.
Suppose “N ⊃ M” and “N* ⊃ M*” to be two laws of inertia, given T.
“M” and “M*” are two predicates each of which represents a type of
natural motion. Is the disjunctive predicate “M-or-M*,” that is, the condition
of being either an M-kind or an M*-kind of motion, itself a natural
type of motion? The disjunction of two kinds of natural motion seems a
very obvious unobjectionable way in which to consolidate the two. If a
man believes M-motions and M*-motions both to be natural, then surely
in calling M- or M*-motions natural, he is calling just those motions natural
which he previously called natural. The consolidation is therefore a trivial
one.
However, even though the disjunctive predicate represents a natural
condition of motion, there may be an important historical point in continuing
to distinguish between the disjuncts. The Newtonian condition of
natural motion as one of continued rest, or constant velocity in a straight
line, is a disjunction of the type of motion which Kepler thought natural,
and the type of natural motion which is traditionally, though wrongly,
ascribed to Galileo. Both Descartes and Newton thought that the Keplerian
condition and the condition attributed to Galileo each represented a type of
natural motion. Believing this, their consolidation of the two by disjunction
was an unobjectionable step. The history of science shows us however that
the transition from the view that only the Keplerian natural motions are
natural to the view that both the Keplerian and Galilean types of motion
are natural is an achievement of a very high order. But this achievement is
not to be confused with disjunction. Consolidation by disjunction is, I think,
a plausible step. Assuming its reasonableness, it follows that there is only
one law of inertia, given a theory T. More precisely, we have a fifth
theorem:
Theorem 5. Let “N ⊃ M” and “N* ⊃ M*” be two true inertialike statements,
which are inertial, given T. Suppose further that the predicate
“M-or-M*” represents a natural kind of motion. Then the predicates “M” and
“M*” are coextensive, and so too are the predicates “N” and “N*.” The
proof is as follows: Since “N ⊃ M” and “N* ⊃ M*” are true statements,
it follows that “N ⊃ (M-or-M*)” and “N* ⊃ (M-or-M*)” are also true.
Moreover, since “M-or-M*” is a condition of natural motion, it follows that
each of these statements is also inertialike. Further, each statement is inertial,
given T. Consider the statement “N ⊃ (M-or-M*).” Features of
motions which fail to be M-or-M* motions are features of motions which
fail to be M-motions. However, “N ⊃ M” is inertial, given T, so that
(features of) motions which fail to have M can be explained using the
theory T together with the fact that the situation is not of type N. Therefore
“N ⊃ (M-or-M*)” is inertial, given T. In a similar manner,
“N* ⊃ (M-or-M*)” is also shown to be inertial, given T. Thus “N ⊃ (M-or-M*)”
and “N* ⊃ (M-or-M*)” are true inertialike statements which are inertial,
given T. By Theorem 3, therefore, both “N” and “N*” are maximal predicates,
given “M-or-M*.” Consequently “N” and “N*” are coextensive.
Therefore the two conditional statements “N ⊃ M” and “N ⊃ M*” (instead
of “N* ⊃ M*”) are true, inertialike, and inertial, given T. Again,
by Theorem 3, we conclude that “M” and “M*” are maximal predicates,
given “N.” Consequently, “M” and “M*” are also coextensive. There is
therefore no difference between the two laws of inertia, “N ⊃ M” and
“N* ⊃ M*,” other than a choice of which specific predicate of a number of
coextensive ones is used in the formulation of the statement. There is therefore
only one law of inertia, given the theory T.
There is a complementary result which concerns the conditions of
normalcy rather than those of natural motion. Given two conditions of
normalcy, “N” and “N*,” we can form their conjunction “N-and-N*.”
This new condition is not an empty one, for it will be recalled that there are
certain situations which are qualified as normal by every condition of normalcy.
They are the cases in which a body is isolated from all others or is the
only body present. If we assume further that the condition “N-and-N*” is
not only nonempty but normal, then we shall say that the conjunctive condition
is satisfied. It is easy to see that the following theorem is true.
Theorem 6. If “N ⊃ M” and “N* ⊃ M*” are two true inertialike statements,
both inertial, given T, and the conjunctive condition is satisfied, then
there is only one law of inertia, given T.
We mentioned above that in order for a statement to be a law of inertia,
it had to stand in a certain relation to a theory of motion; that is, it had to
be inertial, given that theory. We know that whenever a theory of motion
has a law of inertia, then it has exactly one. We understand that under
certain plausible conditions, the law of inertia of a given theory is unique,
so that we may talk of the theory’s law of inertia. But we do not know how
these uniquely associated laws of inertia are related to each other. For
example, if two theories are logically inequivalent, are their laws of inertia
logically inequivalent? The following remarks are not original with myself,
but they underscore the point that what is methodologically interesting and
distinctive about the law of inertia is connected with the fact that it is related
in a special way to a dynamical theory.
Suppose that there are two theories T and T* such that T* is a sub-theory
of T. This means that every logical consequence of T* is also a logical consequence
of T. On the assumption that each theory has its associated law of
inertia, how are those laws related? It is easy to show that the same law of
inertia must be associated both with the theory T and its sub-theory T*. Let
us denote by “N* ⊃ M*” the law of inertia of T*. It is true and inertialike.
Further, “N* ⊃ M*” is inertial, given T. To see why this is so, suppose that
there is a deviation from an M*-type motion. The deviant motion can be
explained by the theory T* together with the information that there has been
a deviation from the N*-state. Certainly the same explanation is available,
using the theory T, since T* is a sub-theory of T. Consequently “N* ⊃ M*”
is a true inertialike statement which is inertial, given T. But this is also true
of “N ⊃ M,” the law of inertia of T. By Theorem 5, they are the same law of
inertia, up to an interchange of coextensive predicates.
This argument seems to show that changes which increase the deductive
power of a theory do not bring with them any change in the law of inertia.
Another more important relation between theories of motion and their
laws of inertia concerns the explanatory power of theories. Roughly speaking,
the greater the explanatory power of a theory of motion, the more narrow
will be the scope of its law of inertia.
There are many difficulties connected with an adequate description of the
relations “T has greater explanatory power than T*” or “T has the same
explanatory power as T*.” We shall say that one theory has greater
explanatory power than another if and only if the former theory’s explanatory
consequences (those events or regularities which it explains)
include the explanatory consequences of the latter theory. Further, two
theories have the same explanatory power if and only if they have the same
explanatory consequences. The set of explanatory consequences is not very
well defined, since a theory does not explain events; it explains events under
given descriptions.11 Consequently it has to be specified whether it is events
which belong to the set of explanatory consequences or events-under-given-
descriptions. If the former, then when is an event explained by a theory?
When it is explained under at least one description? When it is explained
under “all” descriptions of the event? On the other hand, if it is events under
given descriptions which belong to the consequence set, is it only one
event-under-a-given-description or should all the events-under-their-descriptions
be included? Further, it is not generally true that any two theories can
be compared with respect to their explanatory power since the sets of their
consequences, however specified, may overlap or be disjoint. Comparability
of theories with respect to explanatory power cannot be taken for granted.
We shall think of the explanatory consequences of a theory as containing
the laws explained, together with events under their various descriptions, if
they are explained under those descriptions. The relation between the explanatory
power of a theory and the scope of its law of inertia can be stated
more sharply in the form of a theorem, the seventh thus far. If what it states
is correct, there are important implications for some recent discussions of
the logical status of the law of inertia.
Theorem 7. Let T and T* be two theories of motion and “N ⊃ M” and
“N* ⊃ M*” be their respective laws of inertia. We shall assume that T and
T* are comparable with respect to explanatory power. That is, either T has
the same explanatory power as T* (T ≈ T*), or T has greater explanatory
power than T* (T > T*), or T* has greater explanatory power than T
(T* > T), and one and only one of the three cases holds true. Then (1) if
T ≈ T*, then every M is M* and conversely; (2) if T > T*, then every
M is M*; (3) if T* > T, then every M* is M.
The proof proceeds by an indirect argument in each case. Consider the
second case, that is, when T has greater explanatory power than T*. Suppose
there were some entity α which has M but does not have M*. Since α
has a non-M* motion, by hypothesis the theory T* can explain certain features
of its motion. However, the theory T cannot explain those features of
α’s motion because that motion is of type M, and the theory T does not
explain features of its natural motions. The reason is that natural motions
are explanatorily homogeneous. To see that this is so, suppose that α has
M but not M*. Consider some feature F of this non-M* motion. For example,
if “M*” is the predicate “is an unaccelerated motion,” then non-M*
motions are accelerated motions, and a feature of this kind of motion would
be just a further condition upon accelerated motion such as “is an accelerated
motion which varies periodically with the time.” T* by hypothesis can
explain why α has F rather than not. On the other hand T cannot explain
why α has F rather than not. The reason is that the characteristic F serves to
divide the class of natural motions which are M into two kinds: those which
are both M and F and those which are M but not F. If T could explain why
α has F rather than not, “M” would be explanatorily inhomogeneous and
therefore fail to be natural. Consequently T* can explain features of α’s
motion which T cannot. Therefore T cannot have greater explanatory power
than T*. This argument also shows that T cannot have the same explanatory
power as T*. Therefore, assuming comparability of T and T*, it follows
that T* has greater explanatory power than T. But this contradicts our initial
assumption that T has greater explanatory power than T*. Therefore it is
false that there is an α which has M but does not have M*, or, every M is M*.
This concludes the proof for the second case. A similar argument proves the
third case, when T* has greater explanatory power than T. For the first of
our three cases, when the two theories have the same power, we can assume
that there is an entity α which has M but fails to have M*, or that there is an
entity β which has M* but fails to have M. We can deduce contradictions
in either of these cases by arguments similar to those already given. Therefore
“M” and “M*” are coextensive predicates, and the proof of Theorem
7 is complete.
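The three clauses of the theorem, and the shape of the indirect argument for the second, can be set out compactly. This is a sketch only. The symbol ≈ for "has the same explanatory power as" and > for "has greater explanatory power than" are notational assumptions, and α is the hypothetical counterexample of the proof.

```latex
\begin{align*}
(1)\quad & T \approx T^{*} \;\Rightarrow\; \forall x\,(Mx \equiv M^{*}x) \\
(2)\quad & T > T^{*} \;\Rightarrow\; \forall x\,(Mx \supset M^{*}x) \\
(3)\quad & T^{*} > T \;\Rightarrow\; \forall x\,(M^{*}x \supset Mx)
\end{align*}
% Reductio for clause (2), following the steps given in the text:
%   (a) Assume T > T^{*} and, for contradiction, M\alpha \wedge \neg M^{*}\alpha.
%   (b) T^{*} explains some feature F of \alpha's non-M^{*} motion.
%   (c) T cannot explain F, since M is explanatorily homogeneous for T.
%   (d) So neither T > T^{*} nor T \approx T^{*}; by comparability, T^{*} > T.
%   (e) (d) contradicts (a); hence \forall x\,(Mx \supset M^{*}x).
```

Clauses (2) and (3) are mirror images, and clause (1) follows by running the same reductio in both directions.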
There is one immediate and very serious objection to this result. It seems
patently true that Galileo’s theory of motion has greater explanatory power
than any of the ancient theories. However, probably all of the pre-Galilean
theories of motion, including Kepler’s, characterized departures from rest
as unnatural and in need of explanation. Every Keplerian natural motion is
therefore a Galilean natural motion. If the seventh theorem is sound, then
Kepler’s theory of motion has greater explanatory power than Galileo’s.
Why should this be so? Assume that the two theories are comparable. If
they were of the same power, then the two classes of natural motions would
be identical, and this is not the case. Further, if Galileo’s theory had the
greater power, then the Galilean natural motions would be included among
the Keplerian ones. And this is not the case. Therefore Galileo’s is the
theory of lesser explanatory power. This too is an unacceptable, highly
counterintuitive conclusion. The absurdity we feel about this conclusion
rests on the firm conviction that Kepler’s theory of tangential jolts and planetary
yearnings for the sun is false. Since there can be no explanation which
uses premises known to be false, it follows that Kepler’s theory has no explanatory
consequences. This fact explains why we find the statement absurd
rather than simply false. Fortunately, the fact that Kepler’s theory has
no explanatory consequences also shows that the counterintuitive conclusion
is not a corollary of the seventh theorem. The reason is that the proof of the
theorem requires that the theory T* have explanatory consequences. But
we might go one step further and argue that the supposed conclusion, if
properly understood, strengthens rather than undermines our belief in the
theorem. Suppose we abstract from the falsity of Kepler’s theory of motion
and ask what would have been the case had Kepler’s theory of motion been
successful?12 What if Kepler’s theory provided explanations of every change
in the position of a body while Galileo’s, at best, explained only changes in
velocity? Then surely the theorem would yield the correct conclusion:
Kepler’s theory would have greater explanatory power than Galileo’s.

VII

The relation between the explanatory power of theories and the scope of
their laws of inertia has consequences for a thesis recently defended by Brian
Ellis in a very illuminating study.13 According to Ellis, the law of inertia is
“conventional.” His thesis complements an earlier claim by Duhem. Duhem
argued that in the face of a disconfirming experimental outcome, any select
part of a theory could be retained, provided suitable changes could be carried
out in the remaining part of the theory. And Duhem suggested that
such changes could always be carried out in a satisfactory manner. The situation
which Ellis asks us to consider is quite different. He seems to argue
that even when no contrary experimental result arises it is possible to change
any one of the laws of Newtonian mechanics, provided suitable changes are
wrought on the theoretical remainder. It is this feature of the laws of mechanics
which Ellis terms their “conventionality.” In the study referred to,
he tried to show, in particular, that Newton’s law of inertia is conventional.
The argument is based upon a specific replacement which Ellis offers for the
Newtonian law of inertia.
Newton’s first law states that under certain normal circumstances, bodies
execute velocity-preserving motion. Instead of this law, Ellis proposes
another which he also believes to be inertialike: Under normal conditions,
bodies execute natural motions, that is, those motions which bodies undergo
when they are solely under the influence of gravitational attraction. The
motions which he has singled out are of course not those which are described
in Newton’s first law. In addition to this new principle of natural motion,
Ellis sketched the remaining laws of his new system of dynamics and claimed
that his dynamics taken together with his law of inertia is “in every way as
adequate as Newton’s.”14 If his assertion of equal adequacy is correct, then
the law of inertia is conventional in his sense of that term.
But Ellis’ claim is certainly mistaken. There are many factors which have
to be spelled out in explaining how one theory is “in every way as adequate
as another,” but surely one of the requirements is that the two theories have
the same explanatory power. However, the two systems of mechanics,
Newton’s and Ellis’, cannot have the same explanatory power. If they did,
then by our last theorem, their conditions of natural motion would have to
be coextensive. We have already noted that this is not the case, so that Ellis’
conventionality thesis is simply wrong.
Nevertheless, Ellis’ mechanics, if sound, raises larger issues which are
similar to those raised by Ernst Mach’s critique of Newtonian dynamics. In
both cases we are offered a statement, presumed to be true, together with a
theory of motion. The statement is supposed to be a law of inertia which
differs from Newton’s. Ellis’ choice is one which deliberately contrasts with
the Newtonian law. Mach, on the other hand, wished to capture what
Newton intended to say but had said poorly. In each case, the law proposed
is inertial, given the respective theories of motion. But a law of inertia must
also be inertialike, and both Mach’s and Ellis’ are unsatisfactory on that
count. Mach, for example, used an antecedent condition in his statement of
the law of inertia which was not normal. It could not be normal. For all
normal predicates hold true of the case when a body is isolated from other
bodies or an entity is the only body in the universe. Mach held that such
cases were without sense, meaning that so-called descriptions of these cases
had no role to play in any sound scientific body of knowledge. Thus, Mach,
in a rather silent revolution of his own, introduced a change of the law of
inertia rather than a change in it. He was aware of the need for a change of
the law. But this change had to be radical. It did not consist in replacing one
normal predicate by another normal one; it consisted in the more radical
step of changing the criteria of normalcy itself.
The conventionalist problem as I understand it in Ellis’ paper is concerned
primarily with changes in the law of inertia, so that the larger and
more difficult problems of revolutionary change are present but not at issue.

VIII

If I am correct about the complexity of laws of inertia, then there are
certain implications for the support or confirmation of laws of this kind.
Truth is often declared to be one of the goals of science because scientists
seek to discover laws of nature. But scientists not only seek laws simpliciter,
they often seek laws of a certain kind because laws of that kind are especially
important. When a law of a certain kind is sought, the confirmation of the
result will have a blunt as well as a subtle aspect. The scientist wants
to show not merely that a statement is true; he wants to show that a certain
kind of statement is true. Consequently he will try to show that the statement
is true and to show also that it is the kind of law which he sought.
Often it is forgotten that the support or confirmation of a certain kind of
law requires the completion of these two tasks.
Laws of inertia are important for physicists not simply because they are
true statements. These laws tell us what happens in all those cases not explained
by the relevant theory. They are uniquely associated with a theory
of motion; in a way, they serve as the hallmark of the theory and provide a
mark of that theory’s explanatory power. The law, it will be recalled, satisfies
two conditions: It is a true inertialike statement, and it is inertial, given
T. If we provide support for only the first condition’s satisfaction, then we
cannot say that we have support for a law of inertia. The most that we can
say is that we have support for an inertialike statement.
Some authors who have written on the status of the law of inertia have
held that Newton’s first law has the character of a definition; others have
held it to be an empirical proposition. Recently, Hanson has argued that in
some contexts the law is definitional, and in others, empirical.15 However,
even if an inertialike statement L, of the form “N ⊃ M,” is true, it does not
follow that there is a theory which explains all the divergences from M-type
motions. This observation holds true even if the truth of L is a nonempirical
matter. You cannot ground the truth of a law of inertia by showing only that
it is true and inertialike.
Other writers have restricted their discussion of the truth of a law of inertia
to the second condition. Poincare, in one work,16 suggested that the
Newtonian law of inertia is true in a special way which is not clearly a
priori nor clearly empirical. He argued that the law of inertia has empirical
confirmation in some domains such as astronomy. But the law holds true in
all cases because “it can neither be confirmed nor contradicted by experiment.”17
For example, according to Poincare, the law says in effect that if
there are n bodies under normal circumstances, then the acceleration of a
body is a function of its position and the positions and velocities of
neighboring bodies. Suppose it were found instead that the variation of the
acceleration of these bodies depended upon the positions, velocities, and
accelerations of neighboring bodies. Then, by assuming that there are a certain
number of bodies present but not visible, whose existence has already
been admitted, it can be shown that the acceleration of each of the bodies
does depend upon the positions and velocities of its neighboring bodies.
Thus every divergence from the acceleration condition can be explained.
From this truth, Poincare drew the conclusion that the law of inertia holds
in all cases. This argument is mistaken. If there is a purported law “If A
then B” and every case of non-B can be explained using a theory T together
with the information that not-A, it does not follow that the conditional “If
A then B” is true. Consider this counterexample. T is a primitive theory
which includes the law that every person has the blood-type of his father,
and every person has one and only one blood-type. We can explain why
an individual, Hiram, is not the child of X, if we use our theory together
with the information that Hiram does not have the same blood-type as X.
Nevertheless, it is false that any person who has the same blood-type as X
is the child of X. Consequently, even if a statement were inertial, given T,
it does not follow that the statement is true. And it does not follow that
we have provided support for a law of inertia. Both conditions—true and
inertialike, and inertial, given a theory T—are independently true, if true
at all. Granted that both tasks have been carried out so that we can speak
of the support for the law of inertia, two views dominate. The first view
claims that the law of inertia is empirical, though the evidence for it is not
of the direct instantial type. The observation, however, applies equally
well to most theoretical statements. The second view maintains that the law
of inertia is a priori, perhaps analytic, perhaps definitional. But proponents
of this view have usually given the other laws of motion the same blessing.
It is not to truth, of whatever stripe, that we look when we seek what is
methodologically distinctive about the law of inertia.
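
The blood-type counterexample above can be checked on a toy model. The following is a minimal sketch, not part of the original argument: the population, the names other than Hiram and X, and the helper functions are all hypothetical, chosen only so that the primitive theory T holds. It reads "If A then B" as "whoever has X's blood-type is a child of X" and shows that T licenses the inference from not-A to not-B while the conditional itself fails.

```python
# Toy population (hypothetical data). Each person has exactly one blood-type,
# and everyone with a recorded father shares that father's type, so T holds.
people = {
    "X":     {"father": None, "blood": "A"},
    "Abel":  {"father": "X",  "blood": "A"},  # a child of X
    "Hiram": {"father": "Y",  "blood": "B"},  # differs from X in blood-type
    "Y":     {"father": None, "blood": "B"},
    "Zed":   {"father": "W",  "blood": "A"},  # X's type, but not X's child
    "W":     {"father": None, "blood": "A"},
}


def theory_holds(pop):
    """T: every person with a recorded father has that father's blood-type."""
    return all(
        person["blood"] == pop[person["father"]]["blood"]
        for person in pop.values()
        if person["father"] is not None
    )


def has_type_of(pop, name, x):
    """Antecedent A: `name` has the same blood-type as `x`."""
    return pop[name]["blood"] == pop[x]["blood"]


def is_child_of(pop, name, x):
    """Consequent B: `name` is a child of `x`."""
    return pop[name]["father"] == x


def explained_non_children(pop, x):
    """People for whom not-A holds; by T they cannot be children of x."""
    return [n for n in pop if n != x and not has_type_of(pop, n, x)]


def counterexamples(pop, x):
    """People who satisfy A but not B, refuting 'If A then B'."""
    return [
        n for n in pop
        if n != x and has_type_of(pop, n, x) and not is_child_of(pop, n, x)
    ]


if __name__ == "__main__":
    print("T holds:", theory_holds(people))
    print("Non-children explained via not-A:", explained_non_children(people, "X"))
    print("Counterexamples to 'If A then B':", counterexamples(people, "X"))
```

On this reading, the explanation of Hiram's case goes through by the contrapositive of T, and yet the existence of anyone like the hypothetical Zed is enough to make the conditional false, which is all the counterexample requires.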

IX

Even so, one might ask whether there is something distinctive about the
law which invites, perhaps even accommodates, such diverse characterizations
of its truth. There are several features of laws of inertia which might
have encouraged some people to think of them as definitional and others to
regard them as empirical. I do not think that the two features which I shall
discuss are the only ones which could motivate such contrasting descriptions
of the truth of a law. They are, however, features which are more
closely associated with laws of inertia than with any other law, and on that
account they may be more relevant.
Given a theory of motion, its law of inertia has a privileged uniqueness.
There is no alternative to the law which is also inertial, given T. Accordingly,
given the theory, it is logically impossible for there to be a change in its law
of inertia. And this observation might motivate us to model laws of inertia
after analytic truths. For the latter are by design without alternatives. On
the other hand, a law of inertia, since it is inertial given a theory of motion,
forms a natural complement to it. The theory explains what happens in
certain situations, while the law of inertia tells us what happens in the remaining
cases. Theory and law are so closely related in the general enterprise
of explaining motional phenomena that it seems unlikely that they would
differ in logical status. The fact that the theory is empirically grounded suggests
that the same holds true for its law of inertia. Whether a law of inertia
is analytic or synthetic may be extremely difficult to settle. But calling it
analytic to stress its uniqueness and the fact that there can be no change in
it and calling the law synthetic to stress its relation to the theory of motion
are disastrous ways of calling attention to these two distinctive features of
laws of inertia. Doing so provides a contrast where none should be drawn. We do
not have to stress that a law of inertia is inertialike at the expense of its
being inertial, given a theory. These features, singly and together, provide
clues to the complexity of the term “inertia” in some of its major contexts;
they tell us how laws of inertia are related to each other, to theories, and
to the explanatory power of theories. These remarks provide us with some
methodologically distinctive features of the law of inertia which make
credible the importance usually accorded them. They reveal the law’s
methodological character, and they take its theoretical status for granted.

NOTES
1. N. R. Hanson, “Newton’s First Law: A Philosopher’s Door into Natural
Philosophy,” in Beyond the Edge of Certainty, R. Colodny (ed.), Prentice-Hall,
Inc., 1965, pp. 6-28.
2. Cf. H. Poincare, Science and Hypothesis, Dover Publications Inc., 1952,
Chapter VI, esp. pp. 91-98; N. R. Hanson, op. cit., and Brian Ellis, “The Origin
and Nature of Newton’s Laws of Motion,” in Beyond the Edge of Certainty, R.
Colodny (ed.), Prentice-Hall, Inc., 1965.
3. Hanson, op. cit., p. 20.
4. It should be noted that if this is Hanson’s thesis, then it is hard to understand
why he requires a “family of schemata,” when his argument seems to show only
that there is a single expression being used in a variety of ways.
5. R. Descartes, Principles of Philosophy, Part II, XXXVII, XXXIX, and
XL, in Oeuvres de Descartes, C. Adam and P. Tannery (eds.), vol. IX, 1910, Paris:
L. Cerf.
6. Sir Isaac Newton, Philosophia Naturalis Principia Mathematica, London:
1687; second revised edition, R. Cotes (ed.), Cambridge: 1713; third edition,
revised, H. Pemberton (ed.), London: 1726.
7. Sir Isaac Newton, Mathematical Principles of Natural Philosophy and the
System of the World. Tr. by A. Motte (1729); revised with an historical and
explanatory appendix by F. Cajori (Berkeley: University of California Press, 1947).
To cite just two contexts for the causal role Newton assigned to Absolute Space,
(a) just prior to his description of the bucket experiment, he says: “The effects
[sic] which distinguish absolute from relative motion are, the forces of receding from
the axis of circular motion. For there are no such forces in a circular motion
purely relative, but in a true and absolute circular motion, they are greater or less,
according to the quantity of motion,” (p. 10).
(b) “It is indeed a matter of great difficulty to discover and effectually to distinguish,
the true motions of particular bodies from the apparent; because the parts of
that immovable space, in which those motions are performed, do by no means come
under the observation of our senses. Yet the thing is not altogether desperate; for
we have some arguments to guide us, partly from the apparent motions, which are
the differences of the true motions; partly from the forces, which are the causes
and effects (sic) of the true motions.” (p. 12). For a different view of the causal
role of Absolute Space, cf. D. Shapere, “The Causal Efficacy of Space,” Philosophy
of Science, vol. 31, no. 2, p. 113.
8. That is, it is true that if R is a type of natural motion then all motions of
bodies are resolvable into R-type motion.
9. Galileo, Dialogue Concerning the Two Chief World Systems, Tr., with
revised notes by S. Drake. Foreword by A. Einstein, University of California Press;
1962, p. 28.
10. We have written the law as “N ⊃ M,” referring in this way to the longer
closed sentence “(x)(Nx ⊃ Mx).” The claim in the text is that the converse
“(x)(Mx ⊃ Nx)” is deducible from the theory T. To show this we require that
the explanation which T yields of deviant motions be expressible in a system of
postulates based upon a system of logic, such that the deduction theorem holds.
The underlying logic may have to be as complex as an applied functional calculus of
the fourth order since we require the use of analysis in the theoretical explanations.
Vide. A. Church, Introduction to Logic, Princeton University Press, 1956, p. 317,
fn. 520.
11. This significant concept of explanation and prediction of events under given
descriptions was first introduced and exploited in S. Morgenbesser’s “The Deductive
Model and its Qualifications,” in Induction: Some Current Issues, H. E. Kyburg Jr.
and E. Nagel (eds.), Wesleyan University Press, 1963, pp. 169-179.
12. Our abstraction from the falsity of Kepler’s theory means that we are no
longer considering the true explanatory power (or simply, the explanatory power)
of theories, but their potential explanatory power. This distinction was introduced
by C. Hempel. Cf. Aspects of Scientific Explanation, Prentice-Hall, Inc., 1965, pp.
338. An analysis of the proof of Theorem 7 shows that it also holds for the relation
of greater potential explanatory power so that the argument offered in the text
is still sound.
13. Brian Ellis, “The Origin and Nature of Newton’s Laws of Motion,” in Beyond
the Edge of Certainty, R. Colodny (ed.), Prentice-Hall, Inc., 1965.
14. Ibid., pp. 41, 65.
15. Hanson, op. cit., pp. 12-13.
16. Poincare, op. cit., pp. 95-96.
17. Ibid., p. 97.
KANT’S PHILOSOPHY OF ARITHMETIC1
Charles Parsons

The interest and influence of Kant’s philosophy as a whole have certainly
been great enough so that this by itself would be enough to make Kant’s
philosophy of arithmetic of interest to historical scholars. It is also possible
to show the influence of Kant on a number of important later writers
on the foundations of mathematics, so that Kant has importance specifically
as a figure in the history of the philosophy of mathematics. However, my
own interest in this subject has been animated by the conviction that even
today what Kant has to say about mathematics, and arithmetic in particular,
is of interest to the philosopher and not merely to the historian of philosophy.
However, I do not know how much of an argument the following
will be for this.
Kant does not discuss the philosophy of arithmetic at any great length,
so that it is virtually impossible to understand him without making use of
other material. What I have used consists mainly of two considerations: the
integration of Kant’s theoretical philosophy as a whole, and modern knowledge
on the foundations of logic and mathematics. The justification for
using the second is twofold; first, I think experience shows that one does
not get far in understanding a philosopher unless one tries to think through
the problems on their own merits, and in this one must use what one knows;
second, if one is today to take Kant seriously as a philosopher of mathematics,
one must confront him with this modern knowledge, which after
all in major respects shows immense progress from the situation in his
lifetime.
I shall be concentrating mainly on one question, which I think must be
answered before one goes farther with the subject: Why did Kant hold that
arithmetic depends on sensible intuition, indeed that arithmetical propositions
in some way refer to sensible intuition? This is, of course, closely
related to the question of why he regarded such propositions as synthetic
rather than analytic. In considering this question, one must very soon
consider Kant’s views on logic and its relation to arithmetic. Also since
the answer to the above question is much clearer if “arithmetic” is replaced
by “geometry,” we shall also give some consideration to Kant’s
views on geometry.
In order to clarify our problem, let us first briefly consider Kant’s concept
of intuition. Intuition is a species of representation (Vorstellung) or,
in the language of Descartes and Locke, idea. Having intuitions is one of
the primary ways in which the mind can relate to or be conscious of
objects. The nearest thing to a definition in the Critique of Pure Reason
occurs in a classification of representations:

This [knowledge] is either intuition or concept (intuitus vel conceptus). The
former relates immediately to the object and is singular [einzeln], the latter
refers to it mediately by means of a feature which several things may have in
common [A 320=B 376-7].2

In the opening sentence of the Transcendental Aesthetic, Kant says:

In whatever manner and by whatever means a mode of knowledge may relate
to objects, intuition is that through which it is in immediate relation to them,
and to which all thought as a means is directed [A 16=B 33].

A passage in §1 of Jäsche’s edition of Kant’s lectures on Logic reads:

All modes of knowledge, that is, all representations related to an object with
consciousness are either intuitions or concepts. The intuition is a singular representation
(repraesentatio singularis), the concept a general (repraesentatio per
notas communes) or reflected representation (repraesentatio discursiva).3

Intuitions are thus contrasted with concepts, which relate to objects only
mediately, by way of certain properties and by way of intuitions which
instantiate them and which relate indifferently to all the objects which
possess the required properties.
What is meant by calling an intuition a singular representation seems
quite clear. It can have only one individuated object. The objects to which
a concept “relates” are evidently those which fall under it, and these can
be any which have the property which the concept represents, so that a
concept will only in exceptional cases have a single object. Thus far, the
distinction corresponds to that between singular and general terms.
One might think that the criterion of “immediate relation to objects”
for being an intuition is just an obscure formulation of the singularity
condition. But it evidently means that the object of an intuition is in some
way directly present to the mind, as in perception, and that intuition is
thus a source, ultimately the only source, of immediate knowledge of objects.
Thus the fact that mathematics is based on intuition implies that it is immediate
knowledge and thus, even though synthetic a priori, does not
require the elaborate justificatory argument which the Principles do
(A 87 = B 120). By the immediacy criterion Kant’s conception of intuition
resembles Descartes’, while by the singularity criterion and his insistence on
a nonintuitive conceptual factor in all knowledge, Kant’s theory of intuition
differs from that of Descartes.
That what is immediately present to the mind are individual objects
seems to be an axiom of Kant’s epistemology, or one might also say
metaphysics, since it goes with the conviction that objects, the primary
existences, are in the first instance individual objects. Thus what satisfies
the immediacy criterion of intuition will also satisfy the singularity criterion.
It does not seem that the converse must be true. The idea of a singular
representation formed from concepts seems quite natural to us. Such a
representation would relate to a single object if to any at all, but it hardly
seems immediately. By associating it with a definite description rather than
with a general term, we would distinguish it from a concept under which
exactly one object falls (even if necessarily). For Kant, however, the passage
from A 320 = B 376-7 seems to allow such a representation to be a
concept; this might also be suggested by the fact that the idea of God is
called a concept; it is nowhere suggested that it is an intuition. However,
Kant never remarks, so far as I know, on the implications of the possibility
of nonimmediate singular representations for the concept of intuition.
This omission may give support to a theory which has been advanced
by Jaakko Hintikka according to which the singularity criterion is the sole
defining criterion: An intuition is simply an individual representation.

In Kant and his immediate predecessors, the term “intuition” did not necessarily
have anything to do with appeal to imagination or to direct perceptual
evidence. In the form of a paradox, we may perhaps say that the “intuitions”
Kant contemplated were not necessarily very intuitive. For Kant, an intuition
is simply anything which represents or stands for an individual object as distinguished
from general concepts.4

Many of the passages Hintikka cites also mention the immediacy criterion,
and it is not clear why Hintikka thinks it nonessential. The main reason,
which we shall consider later, is that this assumption supports a theory of
Beth and Hintikka to explain Kant’s notion of “construction of concepts
in intuition” and the resulting analysis of mathematical demonstration.
Another seems to be the absence of the immediacy criterion in the Logic
and the fact that Kant makes remarks on concepts which seem to exclude
essentially singular concepts and thus to imply that all singular representations
are intuitions.5
Hintikka also points out that the part of the Transcendental Aesthetic
where Kant argues that space is an intuition argues essentially that the
representation of space is singular. However, he has opened the Aesthetic
by stating the immediacy criterion (A 16 = B 33, cited above) and in the
proof of intuitivity he does say that space is given (B 39, also A 25).
Moreover, in arguments for the same thesis in the Inaugural Dissertation
of 1770, Kant does appeal to immediacy: Immediately after arguing that
space is a pure intuition because it is a “singular concept,” Kant says of
geometrical propositions that they “cannot be derived from any universal
notion of space but only as it were seen in space itself as if in something
concrete” (§ 15c, emphasis Kant’s). Later he says, “Geometry makes use
of principles which fall under the gaze of the mind.”
It seems to me that the textual evidence for Hintikka’s view is not
sufficient to outweigh the clear statements and emphases on the immediacy
criterion, even though the alternative view must assume that Kant in discussing
these matters did not keep in mind the possibility of nonimmediate
singular representations.6 But Hintikka’s theory really stands or falls on
the interpretation of the role of intuition in mathematics.
A thesis about intuition which is of great importance for Kant is that
our mind can acquire intuitions of actual objects only by being affected
by them. Just what this “affection” is I shall not venture to say, but it
involves for the subject a certain passivity, so that our perceptions are not
on the face of it brought about by our own mental activity, and also a
certain exposure to contingency in our relations with objects. Thus we do
not perceive objects unless they physically affect our sense organs.
A particular and highly important twist of Kant’s philosophy is that the
nature of our capacity to be affected by objects, our sensibility, already
determines certain characteristics of our intuitions. These are said to be
the form of our intuition in general. Among them is spatiotemporality.
This must be understood to mean that the nature of the mind determines
that the objects we intuit should be spatial and temporal, and indeed
intuited as such. The intuition which plays a role in mathematics, which is
not the direct result of the affection of our mind by objects, expresses an
intuitive insight which we have into our forms of intuition and is in that
sense still an intuition of sensibility. It is apparently also sensible intuition
in the sense of being intuition of inner sense.
As Hintikka rightly emphasizes, this intrinsic connection between intuition
and sensibility does not come directly from the concept of intuition
but represents a characteristic of man, or more generally of finite intelligences.
Such an intelligence derives the content of its consciousness from
outside with the resulting exposure to contingency and the necessity of
concepts in order to represent objects not present. Thus not only sensibility
but also thinking, or consciousness through concepts (knowledge through
concepts, B 94), are characteristics of finite intelligences. The alternative is
an “intuitive understanding” whose activity would create the objects of its
awareness. Its awareness would be only intuition; it is called “intellectual
intuition” because it has the spontaneity which for us is characteristic of
thought and because the unity which with us is the result of synthesis of the
given is for it already present in the intuition. It seems clear that intellectual
intuition would satisfy the immediacy criterion.
Let us now turn to Kant’s views on logic. What must strike a person
with modern training most forcefully in considering Kant’s outlook on
logic is the limitation of his knowledge of and conception of it. Kant learned
and taught the established logical lore at a very uncreative time in the
history of the subject. Thus the formal logical analysis he undertakes is
pretty well limited to the categorical proposition-forms of the theory of
the syllogism, with gestures toward hypothetical and disjunctive propositions.
The inferences which are covered are the syllogisms and immediate
inferences of the Aristotelian theory and a few propositional inferences
such as modus ponens. Of propositional logic as an additional developed
theory, or of the additional possibilities of quantification theory, Kant had
no idea.
Kant not only had very limited technical resources at his command;
what is more striking and more damaging to his standing as a philosopher,
he was largely satisfied with logic as he found it. Technically he could
hardly in any case have gone very far beyond the state of the science in his
own time, and he was not a creative mathematician. But what would have
been needed for Kant to be dissatisfied with “traditional logic” might only
have been more insight into his own discoveries.
As is well known, Kant attributed the lack of progress in logic to the
absence of any need for it. He held that logic was established as a science
and then finished off once and for all by Aristotle. This is a false view not
only of the possibilities of discovery in logic but also of the history of the
subject, which, far from not being “able to advance a single step” nor
“required to retrace a single step” since Aristotle, had done both more than
once. Kant’s opinion was also influential and served to create resistance
to more reasonable views both of logic itself and of its history.
Why Kant should have thought the science of logic both completable
and completed is a question which I shall not attempt to answer here. I
do not know whether a serious effort to answer it would uncover interesting
ideas of Kant, which as it is we do not understand. In general, it can be
said that the view harmonized extremely well with the more rationalistic
side of Kant’s way of thinking and with the belief, which he was not the
only great philosopher to hold, that his own work finished off an important
part of philosophy. Kant certainly thought that there were inexhaustible
sources of problems, even philosophical problems, for the human intellect
to wrestle with. But he held that this inexhaustibility lay within limits fixed
by a form, the basic properties of which could be exhaustively described.
This form would belong to the human faculty of thought itself, which so
long as it was dealing with “itself and its form” and not with objects given
from outside or with the manner in which they might be given from outside,
was bound to be capable not only of being on sure ground but of
uncovering and analyzing every relevant factor. Reason and the understanding
are “perfect unities” (A xiii, A67 = B92). We also find an echo of
the Cartesian idea that the self is better known than objects:

I have to deal with nothing save reason itself and its pure thinking; and to
obtain complete knowledge of these, there is no need to go far afield, since I
come upon them in my own self [A xiv].

Logic is, according to Kant, the most general of all divisions of knowledge.
It applies to all objects of our thought in general, and all true statements
and sound inferences must conform to it. In particular and especially
important, logical possibility is the most inclusive kind of possibility. If
something is possible in any respect whatsoever, it is logically possible; its
concept does not involve a contradiction. In particular, at least as far as
Kant’s explicit statements are concerned, the applicability of logic is not
limited by the forms of our sensibility.
The relation between logic and the forms of intuition can best be seen
by contrast with geometry: The forms of intuition provide the basis for
certain necessary truths, in particular those of geometry, in the sense that
if the forms of intuition were not as they are the truths in question would
not hold, and if we did not have a certain insight into our forms of intuition,
we would not know them. The application of these truths, however,
is limited to the objects which affect our senses. Moreover, the principles
are true of these objects only as they appear and not as they are in themselves.
These limitations do not obtain for logic. In particular, there are states
of affairs which are logically possible but which are excluded by the forms
of intuition, such as the existence of spatial configurations contrary to the
theorems of Euclidean geometry; so that geometry is a more special theory
than logic, not only in the sense that it deals with a more restricted type
of object but also in the sense that it makes statements about these objects
which are not logically necessary, although they are necessary in another
way.
Logic is also not subject to the great limitation of knowledge based on
intuition, that of appearances. When Kant says that it must be possible
to think of things in themselves, he implies first that such a conception does
not contradict the laws of logic, and second that in the statements we make
about them, the logical laws are still a negative criterion of truth. If he
could not trust logic in this realm, Kant’s metaphysics of morals would
not be able to get off the ground.
Already on this level, it is possible to see quite clearly some reasons why
Kant should have regarded geometry as synthetic a priori and used an idea
such as that of a form of intuition in order to explain how such a science
was possible. Geometry is a more special theory than logic first in the
sense that it contains nonlogical primitives, second in that its theorems
cannot in general be proved merely by means of definitions and logic,
as Leibniz apparently thought. Indeed this is much more obvious to us than
in Kant’s time, given that we have non-Euclidean geometry and are in
general less tempted to overestimate the power of logic, especially traditional
logic. It is worth pointing out that Euclid’s postulates are what are in
effect existence assumptions, so that here Kant’s general views about
existence would imply that they could not be analytic.
That Kant should then found geometry on the form of our sensible
intuition is not difficult to understand. On the one hand spatiotemporality
is a characteristic property of the objects given to the senses. Moreover,
Kant emphasized that space was an individual the notion of which was
understood in a way analogous to ostension, and the same ostensive understanding
would be necessary for the particular primitives of geometry. On
the other hand, Kant started from the idea that geometry was a body of
necessary truths with evident foundations. That the axioms of geometry
should be empirically verified directly is contrary to their necessity; that
they should be some sort of high-level hypotheses is contrary to their
evidence.
The second observation to make about Kant’s views on logic is that
he never suggests a conventionalist account of logical validity. It is true
that the very general character of logical and analytical truths goes with
their uninformativeness. They reflect the nature of the mind and of certain
particular concepts, and apparently not at all how the world is otherwise.
But this nature and the manner in which particular concepts give rise to
the analytic truths that they do seem to be something given, which will be
in fact the same for all discursive intelligences, even if their forms of intuition
are quite different from ours.
Kant does not give much explanation of how this is, and perhaps he felt
some doubt as to the possibility of giving such an explanation. If we try to
apply the insight which we might get from the “Transcendental Deduction”
to this question, we get into a very difficult dilemma. Namely the essential
activity of the understanding seems to be in relation to material given in
intuition, to bring it to the unity expressed in an objective judgment. In
other words, the notions of object, concept, judgment get their whole sense
from their application to experience. Nonetheless the understanding has a
greater generality than intuition: The forms of intuition are not logically
necessary; and in operating logically with a given notion, it is not necessary
to appeal to intuition or even to suppose that the notion has an intuition
corresponding to it. It is possible in some way for us to recognize that what
can be given in experience is not the whole of possible reality, and even to
recognize that with the help of intuition we can know objects only in a
relative way, as they appear. All this points, even apart from the requirements
of Kant’s moral philosophy, to the presence in us of more general
conceptions of object, concept, judgment, and a fortiori inference. This
dilemma will occupy us again later, since it has an application to the problems
of arithmetic.
With respect to Kant’s philosophy of geometry, the difficulties do not
concern why Kant thought geometry to be a priori intuitive knowledge,
but rather whether this is true and what precisely the theory was by which
he proposed to explain how it could be true. When we turn to Kant’s
philosophy of arithmetic, there is even less difficulty as to why he should
have thought arithmetical propositions a priori. But it is already by no
means easy to see why Kant regarded them as synthetic, as based in some
way on our forms of intuition, in particular on the form of inner intuition,
time, and as limited in their application to appearances.
It will become clearer why Kant regarded arithmetical propositions as
synthetic if we observe that Kant’s concept of analytic proposition most
likely had a much narrower extension than the corresponding concept in
more recent philosophy, e.g., in Frege and logical positivism. Kant does
not formulate his concept with enough precision so that we can be altogether
sure about this. But it seems rather clear from the examples that
when Kant speaks of the concept of the predicate of an analytic judgment
as contained in that of the subject, the situation is analogous to that in
which the subject concept is defined by the conjunction of the predicate
concept with perhaps certain others. This would be a paradigm case where
the connection of subject and predicate is “thought through identity” (A 7
= B 11). An idealized version of an analytic judgment would be one of
the form ‘All AB are A’ or ‘All C are A’, where ‘C’ is defined as ‘A
and B’. This is idealized because, according to Kant, outside mathematics
concepts do not in general have definitions in the proper sense.
It seems certain that a number of other forms would have to be admitted
as analytic, e.g., ‘No AB are not A’ or the propositional ‘If p and q, then
p’.7 But there is no particular reason why ‘7 + 5 = 12’ should be. Kant
says (B 15) that for ‘7 + 5 = 12’ to be analytic, it would have to follow
from the concept of a sum of 7 and 5 by the law of contradiction. This
would be as if it were provable from definitions by a very restricted logic,
probably included in the limited traditional apparatus at Kant’s command,
and it is hard to see how it could be true otherwise.
However, it is one thing to say there is no reason to expect this and
another to understand Kant’s specific reason for thinking it false. Kant
indicates that the way you find out that 7 + 5 = 12 is by a process like
counting, of progressing from 7 to 12 by successive additions of 1, in which
one must operate with a particular instance of a group of 5 objects, which
can only be given in intuition.

We have to go outside these concepts, and call in the aid of the intuition which
corresponds to one of them, our five fingers, for instance, or, as Segner does in
his Arithmetic, five points, adding to the concept of 7, unit by unit, the five given
in intuition. For starting with the number 7, and for the concept of 5 calling
in the aid of the fingers of my hand as intuition, I now add one by one to the
number 7 the units which I previously took together to form the number 5,
and with the aid of that figure [the hand] see the number 12 come into being
[B 15-16].

It is, however, still not clear why that process cannot be either itself
put in the form of a purely logical argument or replaced by something
quite different which can.
There was an attempt to do just this with which Kant was in a position
to be familiar, by Leibniz in the Nouveaux Essais.8 Leibniz worked with
‘2 + 2 = 4’, but the type of argument suffices for any addition formula.
He assumed as an axiom the substitutivity of identity, which Kant would
in all probability have regarded as analytic. Leibniz took as definitions

2 = 1 + 1,
3 = 2 + 1,
4 = 3 + 1,

which is approximately what is done in modern formalizations. Then the
proof goes as follows:

2 + 2 = 2 + 1 + 1 (def. of “2”)
      = 3 + 1 (def. of “3”)
      = 4 (def. of “4”)

The standard modern objection to this argument is that Leibniz should
have inserted brackets, so that it goes

2 + 2 = 2 + (1 + 1) = (2 + 1) + 1 = 3 + 1

and therefore assumes an instance of associativity. We cannot exclude the
possibility that this was known to Kant when he was working on the
Critique of Pure Reason, since it occurs in effect in the book Prüfung der
kantischen Kritik der reinen Vernunft, vol. I (Königsberg, 1789), by Kant’s
pupil Johann Schultz, professor of mathematics in Königsberg.
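
The role of the bracketed step can be made concrete with a small sketch. It is offered only as an illustration, not as a reconstruction of Leibniz's or Schultz's own procedure. It assumes that Python integers may stand in for the positive whole numbers, and the names `succ`, `add`, and `steps` are invented here. Addition is computed from nothing but the rule that m + 1 is the successor of m and the instance of associativity m + (n + 1) = (m + n) + 1, which is how recursive definitions of addition are commonly set up.

```python
def succ(m: int) -> int:
    """Successor, taken as primitive (modelled here by Python's built-in + 1)."""
    return m + 1


def add(m: int, n: int) -> int:
    """Add positive integers using only two rules:
    m + 1 = succ(m), and m + (n + 1) = (m + n) + 1 (an instance of associativity)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return succ(m)
    # n = k + 1 with k = n - 1, so m + n = (m + k) + 1
    return succ(add(m, n - 1))


def steps(m: int, n: int) -> list[str]:
    """List the intermediate sums reached when n units are added to m one at a time."""
    lines = []
    total = m
    for _ in range(n):
        lines.append(f"{total} + 1 = {succ(total)}")
        total = succ(total)
    return lines


print(add(2, 2))
print(steps(7, 5))
```

The second function merely records the unit-by-unit progression from 7 to 12; whether that progression must be given in intuition or can be absorbed into such a definition is exactly the point at issue in the text.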
Putting great weight on the evidence of writings by Schultz and other
disciples of Kant, Gottfried Martin has put forth the hypothesis that Kant
envisaged an axiomatic foundation of arithmetic similar to the classical
axiomatizations of geometry.9 He sees the claim that arithmetic is synthetic
as resting in the first instance on the logical point that arithmetical
propositions such as ‘7 + 5 = 12’ cannot be proved by mere logic from
definitions such as those Leibniz uses. An axiomatic foundation of the sort
which would answer to Martin’s ideas is given in Schultz’ Prüfung. Without
explicitly mentioning Leibniz, Schultz points out that the sort of proof of
an arithmetical identity that Leibniz gives rests on the assumption of associativity.
He gives, for ‘7 + 5 = 12’, an argument which also rests on
commutativity, and seems, wrongly, to think this assumption unavoidable.
But of course commutativity has to be used sooner or later in arithmetic.10
Schultz gives two axioms, the commutativity and associativity of addition.
He neither asserts nor denies the independence of the corresponding laws
of multiplication and of the distributive law. He also gives two “postulates”
which are worth quoting in full:

1. From several given homogeneous quanta, to generate the concept of one
quantum by their successive connection, i.e., to transform them into one whole.
2. To increase and to diminish any given quantum by as much as one wants,
that is, to infinity.11

The second postulate implies that Schultz is not thinking specifically
of the arithmetic of integers but also of continuous quantities. In any case,
the first postulate is the basis for the supposition that the function of addition
is defined; i.e., given numbers m, n, there actually exists a number
m + n.
If we accepted this as actually giving Kant’s conception, there would
still remain the question how intuition enters into the foundation of these
axioms and postulates. About this Schultz has in fact something to say. But
in transferring the conception to Kant we are faced immediately with the
difficulty that he explicitly says that arithmetic does not have axioms.

As regards magnitudes (quantitas), that is, as regards the answer to be given to
the question, “What is the magnitude of a thing?” there are no axioms in the
strict meaning of the term, although there are a number of propositions which
are synthetic and immediately certain (indemonstrabilia) [A 163-4 = B 204].

He considers two possibilities, rules of equality, which he asserts to be
analytic (a proper axiom must be synthetic), and the elementary arithmetical
identities, such as ‘7 + 5 = 12’, which are what he seems to be
referring to at the end of our quotation, which are indeed synthetic and
indemonstrable, but which he declines to call axioms because they are
singular.
This position is reaffirmed in a letter from Kant to Schultz dated November
25, 1788,12 in which he comments on the manuscript of volume I
of the Prüfung. There he gives a reason, which I shall mention later, why
arithmetic should not have axioms. He does say that arithmetic has postulates,
“immediately certain practical judgments.” The general tone of his
discussion suggests that he might regard the general directive to carry out
addition, or the proposition that this can always be done, as postulates, i.e.,
that he might accept Schultz’ first postulate. But what he seems to have
specifically in mind is what he elsewhere calls numerical formulae, i.e.,
7 + 5 = 12.
We cannot be certain, however, that the mathematical material of the
published version of the Prüfung was present in the manuscript that Kant
was commenting on. For it seems from the letter, as Martin points out,13
that the manuscript maintained that arithmetical propositions were analytic,
and thus it is clear that it was considerably revised after Schultz received
Kant’s letter. The fact that in the published version the axiomatic analysis
is used to support the conclusion that arithmetic is synthetic does not prove
that it was not present in the manuscript, although the supposition that the
postulates were there is a bit strained. But that Schultz might have argued
that the commutative and associative laws were analytic is not at all impossible.
(Leibniz argued this at least for commutativity.14)
Even so, unless one accepts Martin’s rather unlikely idea that the
axiomatic analysis was contributed by Kant to Schultz after the letter, it
is hard to escape the conclusion that Schultz understood the mathematical
issue in at least one respect better than Kant himself: Kant does not seem
to have had an alternative view of the status of such propositions as the
commutative and associative laws of addition. He can hardly have denied
their truth, and it seems that if they are indemonstrable, they must be
axioms; if they are demonstrable, they must have a proof of which he gives
no indication.
If when speaking of the axiomatic character of arithmetic, Martin means
that according to Kant arithmetic must make use of propositions which
cannot be deduced by logic and definitions, then there can be no disagreeing
with him. But if he means that Kant had in mind setting up arithmetic
as an axiomatic system of which Schultz’ is a very primitive instance and
that it is in the verification of such laws as the commutative and associative
that the primary application of intuition in arithmetic is to be found, then
Kant’s actual words go against him.
Even if Martin’s view of the matter is quite correct as far as it goes, it
cannot satisfy us. In the first place, it does not answer the question why
arithmetic should depend on intuition, except in the sense, entirely bound
to the primitive level of axiomatics in Kant’s time, that so far as one can
see the obvious alternative is insufficient. In the second place, it carries
over to arithmetic the considerations which were at work in geometry
while our original sense of difficulty arose from the difference between the
two. And there are many indications, in particular some remarks in the
letter to Schultz, which I shall discuss, that he saw some of this difference
and did not intend to give an entirely symmetrical account.
The problem of the asymmetry of arithmetic and geometry could be
solved by an interpretation suggested by E. W. Beth15 and developed by
Hintikka.16 From their interpretation it seems to follow that if a proposition
B of geometry is proved by a proof which appeals to axioms $A_1, \ldots, A_n$ (I
here include postulates17), then in general the conditional $A_1 \,\&\, \ldots \,\&\, A_n \ .\supset B$
is synthetic; at any rate an appeal to intuition is made over and above
any which is made in verifying the axioms. One could then argue that
since arithmetic according to Kant does not have axioms, only the first
type of appeal to pure intuition occurs in arithmetic.
Beth’s and Hintikka’s hypothesis is that for Kant certain arguments
which can nowadays be formulated in first-order predicate logic involve an
appeal to intuition. In view of the singularity criterion for intuition, the
natural candidates for such arguments are arguments involving singular
terms. For Beth the form of argument involved is illustrated by the proof
that the base angles of an isosceles triangle are equal:

We proceed, as is well known, as a rule as follows: first we consider a particular
triangle, say ABC, and suppose that AB = AC; then we show that
∠ABC = ∠ACB and have thus proved that the assertion holds in the particular
case in question. Then one observes that the proof is correct for an
arbitrary triangle, and therefore that the assertion must hold in general.18

The general form of the argument is as follows: We want to prove
‘$(x)(Fx \supset Gx)$’. We assume a particular a such that Fa. We then deduce
‘Ga’. We then have ‘$Fa \supset Ga$’ independently of the hypothesis. But since a
was arbitrary, ‘$(x)(Fx \supset Gx)$’ follows.
This form of argument, as for example in Beth’s case, is the characteristic
form of a proof in Euclid. In the Discipline of Pure Reason in Its
Dogmatic Employment, the section of the Critique where Kant sets forth
his view of mathematical proof as proceeding by “construction of concepts
in pure intuition,” this form of argument appears clearly in the geometrical
example (A 716-7 = B 744-5). The geometer “at once begins by constructing
a triangle.” By a series of constructions on this triangle and applications
of general theorems to it “through a chain of inferences guided
throughout by intuition he arrives at a fully evident and universally valid
solution of the problem.”
Hintikka concentrates attention rather on the rule of existential instantiation,
that is on arguments of the form

$(\exists x)Fx$
$Fa$

where a is introduced to indicate an F, in view of the fact that the previous
line affirms that there are F’s.19
Both of these arguments have in common that they turn on the use of a
free variable which indicates any one of a given class of objects, so that
an argument concerning it is valid for all objects of the class. They thus
have a formal analogy with the appeal to pure intuition, in that a singular
term is used in such a way that what is proved of it can be presumed
generally valid. Moreover, the manner in which this generality is assured,
namely by not allowing anything to be assumed about a except what is
explicitly stated in premises, is reminiscent of a statement of Kant about
the role of a constructed figure in a proof:

If he is to know anything with a priori certainty he must not ascribe to the
figure anything save what necessarily follows from what he has himself set
into it in accordance with his concept [B xii].

It is noteworthy that in traditional algebra calculations are carried out
on terms and formulae with free variables, where the derivation of such an
equation serves to prove a general proposition. Hintikka interprets the
rather obscure remarks about “symbolic construction” in algebra in this
sense.20
It would naturally follow from the conception of intuition as simply
individual representation that the mere form of these arguments is such
that they involve intuition. Of course, it would not give any plausibility to
Kant’s more far-reaching philosophical theses which turn on the connection
of mathematics with the form of sensibility. Thus the philosophically interesting
aspects of the concept of pure intuition seem to lose their point
when it is pointed out that these arguments can be formalized in pure
quantification theory. This is exactly the conclusion which Beth draws.
One might object that this seems to presuppose that logic itself does not
pose philosophical problems which the notion of pure intuition might be
needed to answer, but on this at least Beth is in agreement with Kant in
most of his utterances. But anyway it seems unlikely that the break between
arguments which turn on the generality interpretation of free variables and
logical arguments which do not is the philosophically most significant
break within mathematical proof.21
One could wish more clear-cut evidence for the attribution of such a
view to Kant or even for the more modest thesis that he started with this
idea in developing his philosophy of mathematics. If it was his mature
view, Kant’s mathematically astute pupil Schultz seems not to have suspected
it since there is no suggestion of it in the Prüfung. Schultz took for
granted that an adequate axiomatization would be such that if the axioms
were analytic so would be all the theorems. Mathematics fails to be analytic
just because in its deductive development synthetic premises must be used.
The same view is expressed by Kant when he says:

For as it was found that all mathematical inferences proceed in accordance
with [nach] the principle of contradiction (which the nature of all apodictic
certainty requires), it was supposed that the fundamental propositions of the
science can themselves be known to be true through that principle. This is an
erroneous view. For though a synthetic proposition can indeed be discerned
in accordance with the principle of contradiction, this can only be if another
synthetic proposition is presupposed, and if it can then be apprehended as
following from this other proposition [B 14].

Against this, it is pointed out22 that Kant says of a geometric proof
that it proceeds “through a chain of inferences guided throughout by
intuition” (A 716-7 = B 744-5). In view of the description Kant gives
of the proof, this could easily mean that in the course of the proof one is
constantly appealing to the evidences formulated in the axioms and postulates.
It would obviously be anachronistic to attribute to Kant a picture of
proof modeled on a formal deduction where the axioms are stated at the
beginning and everything else is logic and where the purpose is not to show
the truth of the proposition proved but merely that it follows from the
axioms. On the contrary, for Kant a Euclidean proof is convincing because
on each particular application of an axiom or postulate the correctness of
what it claims in this particular case is evident.
It must be conceded that it might be true that inference by certain rules
from analytic premises might yield analytic conclusions while inference
according to the same rules from synthetic premises could lead to conclusions
which are not only themselves synthetic but such that the conditional
of premises and conclusion is also synthetic. In particular, the rule
of existential instantiation can only come into play in the presence of an
existential quantifier, and it is not clear that, for Kant, a statement in
which an existential quantifier occurs essentially can be analytic. I can only
say that in such cases the text of Kant does not clearly indicate that the
necessity of an appeal to intuition arises for the inference and not merely
for the verification of the premise.
If Hintikka were right, one could expect that in the passages on algebra
the role of variables would be emphasized. It is possible to find this emphasis
in the passage on A 717 = B 745, but it is not really explicit. The
emphasis of A 734 = B 762 seems different, where Kant says, “The concepts
attached to the symbols, especially concerning the relations of magnitudes,
are presented in intuition” (emphasis mine). The relations would
seem to be expressed by algebraic function signs. Although the passages
on algebra offer some support for Hintikka’s theory, it is less than decisive.
I shall show that there are other possible ways of looking at these passages.
The direct evidence thus seems to me on the whole opposed to the Beth-
Hintikka theory. However, it would have strong indirect support if there
were not other ways to explain how arithmetic can require pure intuition
and to interpret the notion of “construction of concepts,” especially in
algebra. To this end we now return to the problem of the difference between
arithmetic and geometry.
The difficulty can be put in this way: The synthetic and intuitive character
of geometry gets a considerable plausibility from the fact that geometry
can naturally be viewed as a theory about actual space and figures constructed
in it. This space is related to the senses by being a field in which
the objects given to the senses appear, and geometry seems to give quite
substantial information about this space which from the point of view of
abstract thought might be false.
The content of arithmetic does not immediately suggest such a special
character or such a connection with sensibility. Of course in the first instance
it speaks of numbers and purely abstract operations and relations:
equality, addition, subtraction, etc. Then the question is: what is the field
of application of numbers? That is, what sorts of things can be counted,
assigned cardinal or ordinal numbers, or measured and thus assigned continuous
quantities? On the face of it, there is no reason to believe that the
application of arithmetic need be to objects in space and time. Although
this has certainly become more evident since the rise of abstract mathematics,
that mathematical objects themselves could be numbered was something
which Kant was certainly in a position to be aware of. If the application
of arithmetic is to be limited to appearances, this limitation has to be
understood rather broadly in order to reconcile it with obvious facts.
In the case of geometry, it was possible to mention logical possibilities
which the concepts allowed but which did not exist according to the mathematical
theory; Kant gives the example of a two-sided plane figure, and
many more such possibilities were opened up in the development of non-
Euclidean geometry. It was probably impossible in Kant’s time to be clear
about whether such a possibility exists in arithmetic. If it did, it would
give rise to a clear separation of arithmetical from logical truth. This sort
of argument was not available to Kant. The difficulty is made more acute,
some would say insoluble, by subsequent developments in logic, particularly
the efforts of Frege and others to do just what Kant thought impossible:
to reduce arithmetic to logic, to deduce arithmetical propositions from
definitions and propositions of pure logic.
Of course the extent of what counts as “logic” here is considerably
wider than what Kant regarded as such. At the very least, we need for this
type of construction to incorporate some of the theory of classes into logic;
not just the notion of class and some elementary operations concerning
them, but also at least some modest axioms of class existence—how modest
depending on how much arithmetic one wants to deduce.
Both to set forth what we need of this construction for our purposes
and to indicate how far one can go without using set-theoretic devices, I
shall discuss a logical truth which is closely related in meaning to ‘2 + 2 =
4’ and provides the key to the proof of ‘2 + 2 = 4’ in more extended
formalisms. This example will help to indicate how far the cases of arithmetic
and geometry are symmetrical.
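Consider the following schema of the first-order predicate calculus with
identity:

$(\exists_2 x)Fx \cdot (\exists_2 x)Gx \cdot (x)\neg(Fx \cdot Gx)\ .\supset (\exists_4 x)(Fx \lor Gx)$ (1)

where ‘$(\exists_0 x)Fx$’ is an abbreviation for ‘$\neg(\exists x)Fx$’ and ‘$(\exists_{n+1} x)Fx$’ for

$(\exists x)[Fx \cdot (\exists_n y)(Fy \cdot y \neq x)]$,

so that ‘$(\exists_2 x)Fx$’ can be expanded as

$(\exists x)(\exists y)[Fx \cdot Fy \cdot x \neq y \cdot (u)(Fu \supset .\ u = x \lor u = y)]$ (2)

and ‘$(\exists_4 x)(Fx \lor Gx)$’ as

$(\exists x)(\exists y)(\exists z)(\exists w)[Fx \lor Gx \cdot Fy \lor Gy \cdot Fz \lor Gz \cdot Fw \lor Gw \cdot x \neq y \cdot x \neq z \cdot x \neq w \cdot y \neq z \cdot y \neq w \cdot z \neq w \cdot (u)(Fu \lor Gu\ .\supset .\ u = x \lor u = y \lor u = z \lor u = w)]$. (3)
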
Intuitively, the proof of this schema goes like this: Suppose $(\exists_2 x)Fx$ and
$(\exists_2 x)Gx$. Then there are x, y, z, and w such that

$Fx \cdot Fy \cdot x \neq y \cdot (u)(Fu \supset .\ u = x \lor u = y)$
$Gz \cdot Gw \cdot z \neq w \cdot (u)(Gu \supset .\ u = z \lor u = w)$.

We then go on to argue, with the help of ‘$(x)\neg(Fx \cdot Gx)$’, that x, y, z, w
satisfy the condition in the scope of (3), and so we infer that there are
x, y, z, w such that this condition holds.
This schema requires for its formulation only predicate letters, variables,
quantifiers, identity, and logical connectives. The only notion involved which
could possibly be different in principle from what Kant regarded as general
logic is identity, and since that is used in application to quite arbitrary
objects, it does not immediately suggest a restriction as to application as
the geometrical concepts do. Moreover, the schema is proved without the
application of existence axioms: The range of values of the variables can
be any universe whatsoever, even the empty one.23
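
A finite check can make the content of schema (1) and of the recursive abbreviation more tangible. The sketch below is an illustration under stated assumptions, not a proof of validity: it surveys only universes of up to five elements, it models a universe as `range(size)`, and the names `exactly`, `subsets`, `schema_1_holds`, and `valid_up_to` are invented here. The function `exactly` follows the recursive definition of the numerically definite quantifier given above.

```python
from itertools import combinations


def exactly(n, pred, universe):
    """'There are exactly n x such that pred(x)', by the recursive abbreviation:
    exactly 0 means none; exactly n + 1 means some x satisfies pred and
    exactly n things other than x satisfy it."""
    if n == 0:
        return not any(pred(x) for x in universe)
    return any(
        pred(x) and exactly(n - 1, lambda y, x=x: pred(y) and y != x, universe)
        for x in universe
    )


def subsets(universe):
    """All subsets of the universe, used as candidate extensions of 'F' and 'G'."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def schema_1_holds(F, G, universe):
    """Truth value of schema (1) for the given extensions F and G."""
    antecedent = (
        exactly(2, lambda x: x in F, universe)
        and exactly(2, lambda x: x in G, universe)
        and all(not (x in F and x in G) for x in universe)
    )
    consequent = exactly(4, lambda x: x in F or x in G, universe)
    return (not antecedent) or consequent


def valid_up_to(max_size):
    """Check schema (1) for every F and G over universes of size 0..max_size."""
    for size in range(max_size + 1):
        universe = range(size)  # size 0 is the empty universe
        for F in subsets(universe):
            for G in subsets(universe):
                if not schema_1_holds(F, G, universe):
                    return False
    return True


print(valid_up_to(5))
```

In universes of fewer than four elements the schema comes out true only because its antecedent cannot be satisfied, which is why no existence assumption is needed for its validity.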
Frege and his twentieth-century followers certainly thought that by their
construction they had refuted the view that arithmetic depends in any
way on “pure intuition,” sensibility, or time. Thus the temporal notion of
the successive addition of units, or the even more concrete one of combining
groups of objects, is replaced in Frege’s construction by the timeless
relation of one class being the union of two others, which can be defined
in terms of the logical connective alternation as it occurs in (1). Moreover,
the construction provides a framework for the application of the concept of
number far beyond the scope of concrete appearances, in particular, in the
elaboration of set theory.
Analogous to a non-Euclidean space would be a possible world in which
the arithmetical identities turned out differently, for example, in which
2 + 2 = 5. But would that not be a world in which there was a counterinstance
to our schema, and therefore in conflict with logic? Only, of course,
if the connection of meaning between ‘2 + 2 = 4’ and the schema (or
‘2 + 2 = 5’ and a similar schema) is preserved. I am inclined to regard
the breaking of this connection as a change in the meaning of addition.
There is, however, one way out of this dilemma. With ‘2 + 2 = 5’ we
would associate the schema
$(\exists_2 x)Fx \cdot (\exists_2 x)Gx \cdot (x)\neg(Fx \cdot Gx)\ .\supset (\exists_5 x)(Fx \lor Gx)$ (4)
Now suppose we had a universe U in which for any choice of extensions of
‘F’ and ‘G’ this schema came out true. Even according to our notions of
logic, there is a possible case in which this happens, and in which (since
(1) is valid) there is also no conflict with (1), namely in which U contains
fewer than four elements. In that event the antecedent of the above would
always be false.24
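
The point about small universes can be checked by brute force. The following sketch is only illustrative and rests on a simplifying assumption: instead of expanding the numerically definite quantifiers, it counts the members of finite sets directly. The names `extensions`, `schema_holds`, and `true_for_all_extensions` are invented for the purpose. It compares the schema with consequent 'exactly 4' against the one with consequent 'exactly 5' over universes of size 0 to 5.

```python
from itertools import combinations


def extensions(size):
    """All subsets of a universe {0, ..., size - 1}, as frozensets."""
    items = range(size)
    return [frozenset(c) for r in range(size + 1) for c in combinations(items, r)]


def schema_holds(k, F, G):
    """Schema with consequent 'exactly k things are F or G':
    exactly 2 F, exactly 2 G, and nothing both F and G, implies exactly k F-or-G."""
    antecedent = len(F) == 2 and len(G) == 2 and not (F & G)
    return (not antecedent) or len(F | G) == k


def true_for_all_extensions(k, size):
    """True if the schema holds for every choice of F and G in a universe of this size."""
    exts = extensions(size)
    return all(schema_holds(k, F, G) for F in exts for G in exts)


for size in range(6):
    # columns: universe size, schema with 4 in the consequent, schema with 5
    print(size, true_for_all_extensions(4, size), true_for_all_extensions(5, size))
```

On this way of counting, the schema associated with ‘2 + 2 = 5’ survives only where two disjoint pairs cannot be found at all, that is, in universes of fewer than four elements; from four elements on it has counterinstances, while the schema with 4 in the consequent continues to hold.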
If one considers the minimal existence axioms which would be needed to
prove the categorical ‘2 + 2 = 4’ in modern set theory, one finds that again
they require the universe to contain at least four elements, which can be
identified with the numbers 1, 2, 3, 4.
If we accept first-order quantification theory with identity as a logical
framework, then it seems that we can maintain the symmetry of arithmetic
and geometry in a weak sense, that such propositions as ‘2 + 2 = 4’ imply
or presuppose existence assumptions which it is logically possible to deny.
To draw the line at this point and to declare thus that set theory is not
logic seems to me eminently reasonable; but I shall not argue for this now,
particularly since I have done so elsewhere.25
I think the presence of existence propositions in mathematics is one of
the considerations at stake in Kant’s views on mathematics, but it is not
clearly differentiated from others. His general views on existence imply that
existential propositions are synthetic, but he never applies this doctrine
directly to the existence of abstract entities. In the letter to Schultz cited
above, Kant says that arithmetic, although it does not have axioms, does
have postulates. Postulates as to the possibility of certain constructions,
for Kant constructions in intuition, played the role of existence assumptions
in Euclidean geometry. Schultz states as a postulate in the Prüfung essentially
that addition is defined.
This factor is also present in Kant’s remarks about “construction of concepts
in pure intuition,” which he regards as the distinguishing feature of
mathematical method. If the geometer wants to prove that the sum of the
angles of a triangle is two right angles, he begins by constructing a triangle
(A 716 = B 744). This triangle, as we indicated above, can serve as a
paradigm of all triangles; although it is itself an individual triangle, nothing
is used about it in the proof which is not also true of all triangles. The proof
consists of a sequence of constructions and operations on the triangle.
Kant’s view was that it is by this construction that the concepts involved
are developed and the existence of mathematical objects falling under them
is shown. Although we need not regard this theorem as implying or presupposing
that there are triangles, Kant regarded a general proposition as
empty, as not genuine knowledge, if there are no objects to which it applies.
In this instance only the construction of a triangle can assure us of this.
Apart from that, further existence assumptions are used in the course of
the proof, in the example of A 716 = B 744 of extensions of lines and of
parallels.
The same factor is also suggested in the rather puzzling passage in which
Kant says that the operation with variables, function symbols, and identity
in traditional algebraic calculation involves “exhibiting in intuition” the
operations involved, which he calls “symbolic construction.” In fact, such
operation presupposes that the functions involved are defined for the arguments
we permit ourselves to substitute for the variables. Moreover, the
construction of an algebraic expression for an object to satisfy a certain
condition is the very paradigm of a constructive proof of the existence of
such an object. However, I think there is something else at stake in this
passage, which I shall come to.
It is by no means obvious that the existence assumptions which must be
made in the deductive development of mathematics have any connection
with sensibility and its alleged form. Frege for one was quite convinced
that they did not. What Kant says that bears on this point is not completely
clear, partly because in the nature of the case it is bound up with some
difficult notions in his philosophy, partly because again he did not disengage
this issue from some others.
As a preliminary remark, we must observe that Kant certainly did not
regard arithmetic as a special theory of, say, time, in the sense in which
he regarded geometry as a special theory of space. It does not turn up
in this connection in the proofs of the apriority of time in either the
Aesthetic or the corresponding discussion in the Inaugural Dissertation
(§ 12, § 14 no. 5).
Nevertheless it is clear that according to Kant, the dependence of arithmetic
on the forms of our intuition is in the first instance only on time. I
should venture to say that space enters the picture only through the general
manner in which inner sense, and thus time, depends on outer sense,
and thus space. We shall be clear about the intuitive character of arithmetic
when we are clear about the manner in which it depends upon time.
Whenever Kant speaks about this subject, he claims that number, and
therefore arithmetic, involves succession in a crucial way. Thus in arguing
that intuition is necessary to see that 7 + 5 = 12:

For starting with the number 7, and for the number 5 calling in the aid of the
fingers of my hand as intuition, I now add one by one to the number 7 the units
which I previously took together to form the number 5, and with the aid of
that figure [the hand] see the number 12 come into being (B 15-16, emphasis
mine).

When he gives a general characterization of number in the Schematism,
the reference to succession occurs essentially:

The pure image of all magnitudes (quantorum) for outer sense is space; that of
all objects of the senses in general is time. But the pure schema of magnitude
(quantitatis), as a concept of the understanding, is number, a representation
which comprises the successive addition of homogeneous units [A 142=B 182].

As I said, this seems to conflict not only with the interpretation which
number and addition acquire in such constructions as Frege’s, in which
instead of the successive addition of “units” we have a timeless relation,
for example, that one set is the union of two others; but also with the application
of these notions within modern mathematics, in which arithmetical
statements can be made about structures which are entirely timeless, and
in reference to which any talk of “successive addition” is on the face of it
entirely metaphorical.
In the letter to Schultz, Kant qualifies his position in a way which does
more justice to this more general character of arithmetic:

Time, as you quite rightly remark, has no influence on the properties of
numbers (as pure determinations of magnitude), as it does on the property
of any alteration (as a quantum), which itself is possible only relative to a
specific condition of inner sense and its form (time); and the science of number,
in spite of the succession, which every construction of magnitude [Grösse]
requires, is a pure intellectual synthesis which we represent to ourselves in
our thoughts.

Earlier in the letter he writes:

Arithmetic, to be sure, has no axioms, because it actually does not have a
quantum, i.e., an object of intuition as magnitude, for its object, but merely
quantity, i.e., a concept of a thing in general by determination of magnitude.

Kant is here in fact reaffirming a position affirmed in the Dissertation:


To these there is added a certain concept which, though itself indeed intellectual,
yet demands for its actualization in the concrete the auxiliary notions
of time and space (in the successive addition and simultaneous juxtaposition of
a plurality), namely, the concept of number, treated of by arithmetic [§ 12].

These remarks place arithmetic less on the intuitive and more on the
conceptual side of our knowledge. If arithmetic had for its object “an
object of intuition as magnitude,” i.e., forms such as the points, lines, and
figures of geometry, then it would refer quite directly to a form of intuition.
But instead it refers to “a concept of a thing in general”; the science of
number is a “pure intellectual synthesis.” This latter phrase especially suggests
that arithmetical notions might be definable in terms of the pure categories
and thus be associated with logical forms which do not refer at all
to conditions of sensibility. Such a view would seem to conflict with the
statement of the Schematism that number is a schema.
The reference to “a concept of a thing in general” is no doubt to be
meant in the same sense as that in which the categories are said to specify
the concept of an object in general, and the pure intellectual synthesis is
no doubt that of the second edition transcendental deduction, which is the
synthesis of a manifold of intuition in general, which is for us realized so
as to yield knowledge only in application to intuitions according to our
forms of intuition. Thus the “concept of an object in general” could give
rise to actual knowledge of objects only if these objects can be given according
to our forms of intuition.
But does this merely mean that objects in space and time provide the
only concrete application of these concepts which we can know to exist, as
one might expect from the absence of special reference to intuition?
Whether it means this or something more drastic is, I think, a special case
of the general dilemma about the understanding which I mentioned in
the beginning. In either case, however, it would be a plausible interpretation
of Kant to say that the forms of intuition must be appealed to in order
to verify the existence assumptions of mathematics.
However, it is not very clear how to apply the general conceptions derived
from the Aesthetic and the Transcendental Deduction to the case
at hand. The direct existence propositions in pure mathematics are of abstract
entities, and it is only in the geometric case that they can be said
to be in space and time. I do think that the objects considered in arithmetic
and predicative set theory can be construed as forms of spatiotemporal
objects. Full set theory would of course not be accommodated in this way,
but it is not reasonable to expect that from a Kantian point of view impredicative
set theory should be intuitive knowledge or indeed genuine
knowledge at all. It could legitimately be said to postulate entities beyond
the field of possible experience.26
It is natural to think of the natural numbers as represented to the senses
(and of course in space and time) by numerals. This does not mean mainly
that numerals function as names of numbers, although of course they do,
but that they provide instances of the structure of the natural numbers. In
the algebraic sense, the set of numerals generated by some procedure is
isomorphic to the natural numbers in that it has an initial element (e.g.,
“0”) and a successor relation which the notion of natural number requires.
In this sense, of course, the numerals are abstract mathematical objects;
they can be taken as geometric figures. But of course concrete tokens of
the first n numerals are likewise a model of the numbers from 1 to n or
from 0 to n — 1. A set of objects has n elements if it can be brought into
one-to-one correspondence with the numbers from 1 to n; a standard way
of doing this is by bringing them in some order into correspondence with
certain numerals representing these numbers, that is by counting. (The
numerals used in work in formal logic, for example where the initial element
is ‘0’ and the (n + 1)st numeral is obtained by prefixing ‘S’ to the
nth numeral, have the further property that each numeral contains within
itself all the previous ones so that the nth numeral is itself a model of
the numbers from 0 to n.)
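The parenthetical remark about the numerals of formal logic can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not part of the text discussed: the function names `numeral`, `contained_numerals`, and `count` are hypothetical, and the convention that `numeral(n)` is the numeral denoting the number n (so that ‘0’ denotes zero) is adopted for the example.

```python
def numeral(n):
    """Return the numeral denoting n: 'S' prefixed n times to '0' (assumed convention)."""
    return "S" * n + "0"


def contained_numerals(num):
    """List the numerals contained in num as final segments, from '0' up to num itself."""
    return [num[i:] for i in range(len(num) - 1, -1, -1)]


def count(objects):
    """Pair the objects, in the order given, with the numerals for 1 to n.

    The numeral paired with the last object gives the size of the collection.
    """
    return [(numeral(i), obj) for i, obj in enumerate(objects, start=1)]


three = numeral(3)                     # the numeral 'SSS0'
segments = contained_numerals(three)   # the numerals for 0, 1, 2, 3, in order
pairing = count(["a", "b", "c"])       # a counting of three objects
```

On this convention a single numeral token exhibits, in its final segments, the whole sequence of numerals up to itself, while counting a collection consists in pairing its members with the numerals for 1 to n.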
The basis for the use of a concrete perception of a sequence of n terms
in verifying general propositions is that, since it serves as a representative of
a structure, the same purpose could be served by any other instance of the
same structure, that is, any other perceptible sequence which can be placed
in a one-one correspondence with the given one so as to preserve the successor
relation. This might justify us in calling such a perception a “formal
intuition.” We might note that the physical existence of the objects is not
directly necessary, so that we can abstract also from that “material” factor.
An empirical intuition functions, we might say, as a pure intuition if it
is taken as a representative of an abstract structure. Such a perception
provides the fullest possible realization before the mind of an abstract concept.
One of the important questions about Kant’s philosophy of arithmetic
is whether a comparable realization exists beyond the limits of scale of
concrete perception.
Before we can enter into this question, let me point out another closely
related reason in Kant’s mind for regarding mathematics as dependent on
intuition. This comes out in particular in the concept of “symbolic construction.”
The algebraist, according to Kant, is getting results by manipulating
symbols according to certain rules, which he would not be able to
get without an analogous intuitive representation of his concepts. The
“symbolic construction” is essentially a construction with symbols as objects
of intuition:

Once it [mathematics] has adopted a notation for the general concept of
magnitudes so far as their different relations are concerned, it exhibits in intuition,
in accordance with certain universal rules, all the various operations
through which the magnitudes are produced and modified. When, for instance,
one magnitude is to be divided by another, their symbols are placed together,
in accordance with the sign for division, and similarly in the other processes;
and thus in algebra by means of a symbolic construction, just as in geometry by
means of an ostensive construction (the geometrical construction of the objects
themselves) we succeed in arriving at results which discursive knowledge could
never have reached by means of mere concepts [A 717 = B 745].

That this is a source of the clarity and evidence of mathematics and provides
a connection of mathematics with sensibility is indicated by the following
remark:

This method, in addition to its heuristic advantages, secures all inferences
against error by setting each one before our eyes [A 734=B 762].

A connection of mathematics and the senses by way of symbolic operations
is already claimed in Kant’s prize essay of 1764, Untersuchung über
die Deutlichkeit der Grundsätze der natürlichen Theologie und der Moral,27
which presents a prototype of the theory of mathematical and philosophical
method of the Discipline of Pure Reason in its Dogmatic Employment. For
example, consider the statement of the latter:

Thus philosophical knowledge considers the particular only in the universal,
mathematical knowledge the universal in the particular, or even in the single
instance, although still always a priori and by means of reason [A 714=B 742].

This distinction corresponds in the prize essay to the following, where the
distinctive role of signs in mathematics is explicitly emphasized:

Mathematics considers in its solutions, proofs, and inferences the universal in
[unter] the signs in concreto, philosophy the universal through [durch] the signs
in abstracto.28

The certainty of mathematics is connected with the fact that the signs are
sensible:

Since the signs of mathematics are sensible means of knowledge, one can know
with the same confidence with which one is assured of what one sees with
one’s own eyes that one has not left any concept out of account, that every
equation has been derived by easy rules, etc.; thereby attention is made much
easier in that it must take account only of the signs as they are known individually,
not the things as they are represented generally.29

The prize essay suggests a position incompatible with the Critique of Pure
Reason, namely that since in mathematics signs are manipulated according
to rules which we have laid down (in contrast to philosophy, where the value
of any definition turns on its having a certain degree of faithfulness to pre-analytic
usage), operation with signs according to the rules, without attention
to what they signify, is in itself a sufficient guarantee of correctness.30
These passages show that a connection between sensibility and the intuitive
character of mathematics existed in Kant’s mind before he developed
the theory of space and time of the Aesthetic. However, unlike in the
later work, no inference is drawn at this stage from this connection to a
limitation of the application of mathematics to sensible objects.
The general point behind the observations on symbolic construction can
be put in the following way: In general, a mathematical proposition can be
verified only on the basis of a proof or calculation, which is itself a construction
in intuition. But in view of the remarks about ‘7 + 5 = 12’, a more
special fact may have influenced Kant. Certain “symbolic constructions”
associated with propositions about number actually involve constructions
isomorphic to the numbers themselves and their relations, or at least an
aspect of them. Thus in Leibniz’s proof that 2 + 2 = 4, ‘2 + 2’ must be
written out as ‘2 + (1 + 1)’ and the two 1’s as it were added on to the
‘2’. A corresponding proof of ‘7 + 5 = 12’ would involve five such steps
instead of two.
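The step-by-step character of such a proof can be sketched as follows. This is a minimal illustration in Python, assuming that addition is carried out by adjoining one unit at a time to the first summand; the name `add_by_units` is hypothetical, and the sketch makes no claim to reproduce Leibniz’s own formulation.

```python
def add_by_units(m, n):
    """Add n to m one unit at a time, recording each intermediate total."""
    total = m
    steps = []
    for _ in range(n):
        total += 1           # adjoin a single unit, as in (m + 1) + 1 ...
        steps.append(total)  # keep the intermediate result of this step
    return total, steps


sum_small, steps_small = add_by_units(2, 2)   # two units are added on to 2
sum_large, steps_large = add_by_units(7, 5)   # five units are added on to 7
```

The record of intermediate totals has one entry per unit added, so that the calculation for 2 + 2 takes two steps and the calculation for 7 + 5 takes five.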
A similar observation concerning the schema (1) has been made by a
number of writers. Although the schema does not imply that the universe
contains any elements or that any construction can be carried out, the
proof of it involves writing down a group of two symbols representing the
F’s, another such group representing the G’s, and putting them together to
get four symbols. So that it is not at all clear that ‘2 + 2 = 4’ interpreted as
a proposition about the combinations of symbols is not more elementary
than the logically valid schema (1).
I have already suggested that the “symbolic” construction in generating
numerals is already enough to settle the question of their reference. In the
same way the actual carrying out of the calculations shows the well-defined
character for individual arguments of recursively defined functions. However,
induction, which I have wanted to leave out of account here, is involved
in seeing that they are defined for all arguments. Maybe Kant ought
to have said that apart from intuition I do not even know that there is such
a number as ‘7 + 5’. And it seems that one could not see by a particular
construction that there is such a number without also seeing it to be 12.
This is in agreement with Hintikka’s statement that the sense of Kant’s
statement that numerical formulae are indemonstrable is that the construction
required for their proof is already sufficient.
The considerations about the role of symbolic operations apply equally
to logic and therefore undermine Kant’s apparent wish to distinguish them
on this basis. This appears more forcefully in modern logic, where instead
of a short list of forms of valid inference one has an infinite list which must
be specified by some inductive condition. In my opinion this is a consequence
to be accepted and is even in general accord with Kant’s statements that
synthesis underlies even the possibility of analytic judgments.
The special connection of arithmetic and time can, I think, be explained
as follows: If one constructs in some way, such as on paper or in one’s head,
such a sequence of symbols as the first n numerals, the structure is already
represented in the sequence of operations and more generally in the succession
of mental acts of running through a group of n objects, as in counting.
Thus time enters in through the succession of acts involved in construction
or in successive apprehension. This connects with Kant’s remark
about number in the Schematism. In the operations involved in representing
a number to the senses, we also generate a structure in time which
represents the number. Time provides a universal source of models for the
numbers. In particular, Kant held that it is only by way of successively perceiving
different aspects of a manifold and yet keeping them in mind as
aspects of one intuition that we can have a clear conception of a plurality.
For quite small numbers this seems doubtful although not for larger ones.
Nonetheless the element of succession appears even for the smaller ones
in the comparison involved in generating or perceiving them in order, and
the order is certainly part of our concept of number. What would give time
a special role in our concept of number which it does not have in general
is not its necessity, since time is in some way or other necessary for all concepts,
nor an explicit reference to time in numerical statements, which does
not exist, but its sufficiency, because the temporal order provides a representative
of the number which is present to our consciousness if any is
present at all.
Of course it is one thing to speak of representation in space and time and
another to speak of representation to the senses. What is represented to
the senses is presumably represented in space and time, but maybe not vice
versa. To establish a link of these two Kant would appeal to his theory of
space and time as forms of sensibility. The relevant part of this theory is
that the structures which can be represented in space and time are structures
of possible objects of perception. The kind of possibility at stake here must
be essentially mathematical and go beyond “practical” or physical possibility.31
Consider once again a procedure for generating numerals, say by starting
with ‘0’ and prefixing occurrences of ‘S’. The actual use of these as symbols
requires that they be perceptible objects. Nonetheless we say it is possible to
iterate the procedure indefinitely and therefore to construct indefinitely
many numerals. Thus it is clear that the numerals (numeral types32) which
it is in this sense possible to construct extend far beyond the numeral-tokens
which have ever been produced in history or which could in any concrete
sense actually be used as symbols.33 This possibility of iteration is necessary
for the constructibility of indefinitely many numerals and therefore for the
infinity of natural numbers to be given by intuitive construction. Moreover,
some insight into such iteration seems necessary for mathematical induction.
Insofar as the appeal to pure intuition for the evidence of mathematical
statements is supposed to be an analogy of mathematical and perceptual
knowledge, it holds less well for propositions involving the concept of indefinite
iteration, such as these proved by induction, than for propositions
such as 2 + 2 = 4. There seem to be two independent types of insight into
our forms of intuition which a Kantian view requires us to have, that which
allows a particular perception to function as a “formal intuition” and that
which we have into the possible progression of the generation of intuitions
according to a rule. To speak of a peculiar kind of intuition in the second
case seems quite misleading. The mathematical knowledge involved has a
highly complex relation to “intuition” in the more specifically Kantian case.
That complexity must be in some way present in the “intuitions” of space
and time since space is an individual which is given, but its structure also
determines the limits of possible experience and contains various infinite
aspects. No doubt the plausibility of the idea that space is present in immediate
experience made it more difficult for Kant to appreciate the differences
of the kinds of evidences covered by his notion of pure intuition. I am
sure that more could be done to explicate the Kantian view of their connection.
In our discussion of intuition, we have somewhat lost sight of the view of
logic which at the start we attributed to Kant, which except for the question
of existence resembles the modern views called Platonist. Although Kant’s
view of intuition fits better with the modern tendencies called constructivist
or intuitionist, it seems certain that the concept of pure intuition was meant
to go with this view of logic and not to replace it. Without using notions
like “concept” and “object” in a quite general way, it is probably not possible
to describe it. It would be hasty for that reason to identify Kant’s conception
of intuition with that of Dutch intuitionists, although Brouwer’s
undoubtedly shows some affinity. It would also be hasty to regard Brouwer’s
critique of classical mathematics as altogether in accord with Kantianism.

NOTES
1. An earlier version of this paper was written while the author was George
Santayana Fellow in Philosophy, Harvard University, and presented in lectures in
1964 to the University of Amsterdam and the Netherlands Society for Logic and
the Philosophy of Science. I am indebted to J. J. de Iongh, J. F. Staal, and G. A.
van der Wal for helpful comments. I am also grateful to Jaakko Hintikka for
sending me two unpublished papers on the subject of this paper.
2. i.e., 1st edition, p. 320, 2nd edition, pp. 376-377. All passages are quoted
in the translation of Norman Kemp Smith (London, 1929) with slight modifications.
Other translations from German are my own. Translations of Kant’s Inaugural
Dissertation are by John Handyside, in Kant’s Inaugural Dissertation and Early
Writings on Space, Chicago and London, 1929.
3. Kants Gesammelte Schriften, ed. by the Prussian Academy of Sciences,
Berlin, 1903-1956, IX, 91. This edition will be referred to as “Ak.”
4. “Kant’s ‘new method of thought’ and his theory of mathematics,” Ajatus 27
(1965), 37-47, p. 43. Hintikka argues in detail for this thesis in a paper, “On Kant’s
notion of intuition (Anschauung),” to appear in a volume of essays on Kant edited
by Terence Penelhum and J. H. Macintosh (Belmont, Calif.: Wadsworth). The same
idea seems to underlie the analysis of Kant’s theory of mathematical proof in E. W.
Beth, “Über Lockes ‘allgemeines Dreieck’” in Kant-Studien 48 (1956-1957), 361-380.
5. “It is a mere tautology to speak of general or common concepts” (“Logic”
paragraph 1, Ak. IX 91).
6. One might attribute to Kant the view that there are no such representations.
The classification Kant makes in A 320 = B 376-7 and Logic § 1 is of Erkenntnisse,
which Kemp Smith translates as “modes of knowledge” but which in many contexts
would be more accurately though inelegantly translated as “pieces of knowledge.”
Then the relation of a representation to its object is that through which one can know
its object, and it might be held that intuition in the full sense is the only singular
representation which can provide such knowledge. This view would have the
perhaps embarrassing consequence that an object which is not in some way perceived
is not really known as an individual.
7. Cf. the examples of “truths of reason” given by Leibniz, Nouveaux Essais,
IV, ii, § 1.
8. Ibid. IV, vii, § 10.
9. Arithmetik und Kombinatorik bei Kant (Diss. Freiburg 1934), Itzehoe 1938;
Kant’s Metaphysics and Theory of Science, Manchester, 1953, ch. i; Klassische
Ontologie der Zahl, Kant-Studien Ergänzungsheft 70, Köln, 1956, § 12.
10. Neither Leibniz nor Schultz seems to mention the fact that in order to prove
formulae involving multiplication, such as ‘2 • 3 = 6’, one also needs instances of the
distributive law.
11. 1. Aus mehrern gegebenen gleichartigen Quantis durch ihre successive
Verknüpfung den Begriff von einem Quanto zu erzeugen, d. i. sie in ein Ganzes
zu verwandeln. 2. Ein jedes gegebenes Quantum, um so viel, als man will, d.i. sie
ins Unendliche zu vergrössern, und zu vermindern (Prüfung, I, 221).
12. Ak. X 554-558.
13. Arithmetik und Kombinatorik bei Kant, p. 57.
14. C. J. Gerhardt (ed.), Leibnizens mathematische Schriften, Halle, 1849-1863,
VII 78. Leibniz gives a definition of addition from which he claims commutativity
follows immediately. One could read his argument as deriving the commutativity of
addition from the commutativity of set-theoretic union.
15. “Über Lockes ‘allgemeines Dreieck’.”
16. “Kant’s ‘new method of thought’,” “On Kant’s concept of intuition,” also
“Are logical truths analytic?” Philosophical Review 74 (1965), 178-203, “Kant on
the mathematical method,” The Monist, vol. 51 (1967), 352-375.
17. It ought to be remarked that while no doubt the distinction which Kant
makes between axioms and postulates derives historically from that of “common
notions” and postulates in Euclid, Kant’s distinction does not correspond exactly
to Euclid’s. Euclid’s division is between more general principles and specifically geometrical
ones. For Kant postulates are “immediately certain practical judgments,” the action
involved is construction, and their purport is that a construction of a certain kind
can be carried out. The role they play is thus that of existence axioms. Euclid’s
common notions are all of a type which Kant asserted to be analytic propositions
(A 164=B 204, B 17), while axioms proper must be synthetic.
18. op. cit. p. 365.
19. Cf. W. V. Quine, Methods of Logic, revised ed., New York, 1959, § 28.
20. “Kant’s ‘new method of thought’,” p. 43, also “Kant on the mathematical
method.” The texts are A 717 = B 745, A 734 = B 762.
21. In “Are logical truths analytic?” Hintikka develops a distinction between
analytic and synthetic according to which some logical truths are synthetic. He
suggests that the logical truths which are analytic according to this criterion are
roughly those which Kant would have regarded as analytic. It follows, however,
that in some of the arguments which according to Beth and Hintikka involve for
Kant an appeal to intuition, the conditional of their premises and conclusion is
analytic. In particular, this is true of the example that Beth works out in detail
in “Über Lockes ‘allgemeines Dreieck’,” § 7. In order to be applied to mathematical
examples like Kant’s, Hintikka’s criterion would have to be extended to languages
containing function symbols. The way of doing this which seems to me most in
the spirit of Hintikka’s definition has some anomalous consequences.
See also “An analysis of analyticity,” “Are logical truths tautologies?” “Kant
vindicated,” and “Kant and the tradition of analysis,” in P. Weingartner (ed.),
Deskription, Analytizität, und Existenz, Salzburg and Munich, 1966.
22. Beth, op. cit. p. 363.
23. In fact, (1) is analytic according to the criterion of “Are logical truths
analytic?” (see note 21 above). However, according to another criterion which
might be more in the spirit of Kant, to consider as synthetic a conditional whose
proof involves formulae of degree higher than its antecedent, (1) is synthetic. Hintikka
takes account of this in “Are logical truths tautologies?” by making an additional
distinction between analytic and synthetic arguments, such that in the relevant sense
the argument from the conjuncts of the antecedent of (1) as premises to its consequent
as conclusion is synthetic.
24. Cf. Hao Wang, “Process and existence in mathematics,” Y. Bar-Hillel, E. I.
J. Poznanski, M. O. Rabin, A. Robinson (eds.), Essays in the Foundations of
Mathematics, dedicated to A. A. Fraenkel, Jerusalem, 1961, 328-351, p. 335.
25. “Frege’s theory of number,” Max Black (ed.), Philosophy in America,
London and Ithaca, 1965, pp. 180-203.
26. An interesting intermediate case is how constructive proofs as the object
of intuitionist mathematics could be interpreted from a Kantian point of view.
According to Kant as I interpret him, certain empirical constructions can function
as paradigms so as to establish necessary truths because of the intention or meaning
associated with them. Intuitionism would require that our insight into these meanings
be sufficient not only to establish laws directly relating to objects in space and
time but also to establish laws concerning the intentions as “mental constructions.” I
leave open the question of whether this is possible from Kant’s point of view or not.
27. Ak. II 272-301.
28. Die Mathematik betrachtet in ihren Auflösungen, Beweisen, und Folgerungen
das Allgemeine unter den Zeichen in concreto, die Weltweisheit das Allgemeine
durch die Zeichen in abstracto (Erste Betrachtung, § 2. heading, Ak. II 278).
29. Denn da die Zeichen der Mathematik sinnliche Erkenntnismittel sind, so kann
man mit derselben Zuversicht, wie man dessen, was man mit Augen sieht, versichert
ist, auch wissen, dass man keinen Begriff aus der Acht gelassen, dass eine jede
einzelne Vergleichung nach leichten Regeln geschehen sei u.s.w. Wobei die Aufmerksamkeit
dadurch sehr erleichtert wird, dass sie nicht die Sachen selbst in ihrer
allgemeinen Vorstellung, sondern die Zeichen in ihrer einzelnen Erkenntnis, die da
sinnlich ist, zu gedenken hat. (Dritte Betrachtung, § 1, Ak II 291).
30. But cf. the following: in der Geometrie, wo die Zeichen mit den bezeichneten
Sachen überdem eine Ähnlichkeit haben, ist daher diese Evidenz noch grösser, obgleich
in der Buchstabenrechnung die Gewissheit ebenso zuverlässig ist. (Ibid., 292).
31. One might say that it is possible to construct tokens. The sense of possibility
in which this is possible is, however, derivative from the mathematical possibility of
constructing types (or mathematical existence of the types). For we declare that
the tokens are possible either directly on the basis of the mathematical construction,
or physically on the basis of a theory in which a mathematical space which is in
some way infinite is an ingredient.
32. Cf. my “Infinity and Kant’s conception of the ‘possibility of experience’,”
Philosophical Review 73 (1964), 173-198.
33. This does not imply that there is an upper limit on the numbers which can
be individually represented, once we admit notations for faster-growing functions
than the successor function. This happens already in Arabic numeral notation. The
number 1,000,000,000,000, if written in ‘0’ and ‘S’ notation with four symbols per
centimeter, would extend from the earth to the moon. That there is such an upper
limit follows, of course, from the assumption that human history must come to an
end after a finite time.
A SOVIET PHILOSOPHER’S VIEW OF
PEIRCE’S PRAGMATISM
Philip P. Wiener

Since philosophy in the Soviet Union is officially subsidized and subject
to the approval of the Communist Party, it is fair to assume that Yuri K.
Melvil’s article in the spring 1966 issue of the Transactions of the Charles
S. Peirce Society1 represents the current Soviet view of American philosophy.
Melvil begins his article by saying that Dewey transformed Peirce’s
pragmatism into the “almost ‘official philosophy’ of America.”2 Then, unwilling
to consider even the possibility that pluralistic America has no “official
philosophy,” Melvil dogmatically asserts that, despite “the decay of
pragmatism as a philosophical teaching,”
Pragmatism, however, has remained and probably will remain for a long time
to come the dominating method of political [Melvil’s emphasis] thinking of the
leading American circles, of the governmental, political and trade-union
leadership. The magazine America, published by the U.S. Government, has
knowingly stated that “American political leaders are inclined to pragmatism.”3
In fact, the magazine America is not a philosophical publication and I
know of no political or trade-union leader who has ever referred to Peirce’s
pragmatism or probably has even heard of it. Many journalists and even
college students and teachers of philosophy rarely, if ever, distinguish
Peirce’s pragmatism (which he purposely labeled “pragmaticism”) from
James’s, Schiller’s, and Dewey’s pragmatisms,4 and Melvil shows no more
knowledge than journalists of the distinctive features of Peirce’s pragmat¬
icism. The many important errors of fact and of interpretation in our Soviet
philosopher’s article unfortunately spoil our appreciation of the serious
attention finally given to Peirce’s philosophy by contemporary Soviet
philosophy. Such errors can only impede mutual understanding between those
American and Soviet philosophers who really desire to know each other’s
ideas rather than to dismiss them without accurate study. Peirce emphasized
the “desire to know” as the common bond of all objective thinkers in science
and philosophy. Hence, it is regrettable that Melvil concludes his article
by reference to “the irreconcilable ideological struggle” in our time as a
ground for dismissing Peirce’s attempt to reconcile science with religious
values. All religion bears the Marxian label “superstition” in Melvil’s ac¬
count.5
Melvil admits that “In Marxist literature Peirce is almost an unknown,”6
and offers the lame excuse that little was known about Peirce anywhere
during the first third of this century. Students of Peirce, of course, did not
have to wait until the 1930’s to know that William James’s public acknowl¬
edgment of Peirce as the founder of pragmatism was made in 1897.7 Mrs.
Ladd Franklin, one of Peirce’s class of logic students at Johns Hopkins, paid
tribute in 1916 to Peirce’s logic of relations, only two years after Peirce’s
death.8 Morris R. Cohen surely brought Peirce’s philosophy to the attention
of the philosophic public in the early 1920’s,9 as did John Dewey in 1922,
in a special number devoted to American philosophy in the Revue de
Métaphysique et de Morale.10 It is true that European philosophers knew much
more of William James’s writings than of Peirce’s, perhaps because of the
more graphic style of James’s essays and because James was hailed by
Bergson and others as a pioneer psychologist; the United States was not ex¬
pected to produce any noteworthy philosophers.
By lumping together Peirce’s pragmaticism, James’s radical empiricism,
Schiller’s humanism, and Dewey’s instrumentalism under the label “prag¬
matism,” Melvil asserts the “weakening of the influence of pragmatism” in
logic and philosophy of science because of “the inability of Dewey’s
instrumentalism to deal with logical and methodological problems” of
modern sciences.11 Melvil thus ignores the increasing appreciation and
growing influence of Peirce’s contributions to logic and methodology as
evidenced by the writings of J. Royce, F. P. Ramsey, C. I. Lewis, E. Nagel,
H. Reichenbach, A. Tarski, R. Carnap, and the European, especially Italian
and British, scientific philosophers and analysts who have lately taken a
great interest in Peirce’s ideas. Melvil himself testifies to this fact when he
observes that “Not only in the U.S.A. but also in Canada, England, Switzer¬
land, West Germany, and Italy, voluminous monographs are being pub¬
lished about Peirce’s philosophy.”12
Of course, for Soviet philosophers, the only sound logic and methodology
must be based on dialectical materialism. Hence, Peirce with his remarkable
contributions to the logic and philosophy of science is in Melvil’s view an
“unusual phenomenon in bourgeois philosophy,”13 and can only be ex¬
plained by our Soviet philosopher as “the typical contradiction of a scientist
living in a capitalistic country.”14 Why is there no such contradiction in the
scientific writings of Marx and Engels, Lomonosov and Pavlov, who also
lived in a capitalistic society? Evidently, revolutionary thinkers and pre¬
revolutionary Russian scientists are exempt from the contradictions which
afflict bourgeois scientists. Is it not high time to abandon this dubious dis¬
tinction between two kinds of science and logic and two kinds of truth, bour¬
geois and proletarian, if we are to continue to pursue scientific truth as the
goal of an internationally communal enterprise—that “indefinite commu¬
nity” of scientific investigators proclaimed by Peirce to be the only ultimate
arbiters of truth and definers of objective reality corresponding to such
classless truth? Peirce himself was able to indicate the social and political
causes for the historical division between nominalists and realists, e.g., in
the Ockhamite conciliar controversy; Peirce also attacked the “greed-philos¬
ophy” of rugged individualism in nineteenth-century capitalism, e.g., in
Simon Newcomb’s political economy and Charles Sumner’s sociology, with¬
out impugning the classless character of scientific logic and truth.
Yet, to Melvil, the “basic contradiction” of Peirce’s “bourgeois” philos¬
ophy demonstrates the “unscientific nature of pragmatism,”15 principally
because Peirce thought that science was compatible with the love of God.
It would follow that only atheists can be truly scientific thinkers, which
would exclude such historic figures as Descartes, Newton, Leibniz, Spinoza,
Kant, Faraday, Eddington, Einstein, Whitehead, and countless others who
have pursued the sciences without losing their sense of man’s ultimate union
with the universal source of all things and values.
Melvil assumes that science must be materialistic, atheistic, and the sole
repository of absolute truth. Hence, Peirce’s “bourgeois” philosophy of
science is false and contradictory, since Peirce’s pragmatism repudiates
materialism, defends the Christian love of mankind and of God, and under¬
scores the approximative nature of all scientific conclusions in his doctrine
of fallibilism.
Despite Peirce’s severe and penetrating criticisms of Berkeley’s and Mill’s
nominalism, of Mach’s and Pearson’s sensationism, and of James’s psychol¬
ogism, Melvil insists on branding Peirce’s “pragmatism” as “subjective ideal¬
ism.”16 Peirce’s critical review of Berkeley’s subjective idealism17 should
suffice to prove the invalidity of any such characterization of Peirce’s
epistemology. Melvil completely misinterprets Peirce’s view of the relation of
belief to truth by assigning to Peirce the words that Peirce himself puts in
the mouth of an imagined interlocutor, “Mr. Make Believe.”18 For example,
Melvil interprets Peirce’s Monist article, partly written in dialogue form, as
follows:
According to one of the definitions accepted by Peirce, truth is nothing but “a
belief unassailable by doubt” [5.416], or “a final and compulsory belief.” Here
Peirce does not hint at the relationship between truth and objective reality. He
insists upon saying that truth (as well as reality) should be defined only “in
terms of doubt and belief” [5.416].

We need only refer to the whole text of 5.416 in order to see that Peirce is
arguing against defining truth as only a state of belief, and to see that he
attributes this subjective view of truth to “Mr. Make Believe” (who might
very well represent some of the ideas of William James or of F. C. S.
Schiller). Melvil should have noticed the hypothetical “if” clause in 5.416:

If your [Mr. Make Believe’s] terms “truth” and “falsity” are taken in such
senses as to be definable in terms of doubt and belief . . . , in that case you
[Mr. Make Believe] are only talking about doubt and belief. . . . Your problems
[Mr. Make Believe] would be greatly simplified, if, instead of saying you want
to know the “Truth,” you were simply to say that you want to attain a state
of belief unassailable by doubt19 [Peirce’s emphasis].

As early as 1878, in his essay “How To Make Our Ideas Clear,” Peirce
distinguished the logical from the psychological aspects of knowing. Al¬
though his preceding essay, “The Fixation of Belief,” was primarily a psy¬
chological and social analysis of methods for attaining stable beliefs, Peirce
never abandoned the logical criteria of the conformity of ideas to objective
reality and the logically independent status of universals or laws. Hence,
Peirce cannot, by any stretch of dialectics, be properly called a “subjective
idealist” or even a “phenomenalist,” as Melvil says of Peirce’s equation of
“cognizability (in its widest sense)” and “being.”20 Peirce plainly distin¬
guished his pragmaticism which is not definable as “thoroughgoing phe¬
nomenalism” from James’s pragmatism which was phenomenalistic in its
emphasis on sensuous immediacy.21 Yet Melvil insists on saying that Peirce’s
pragmatism was based on the same idea of truth which we find elaborated
upon by James, which later served as the pragmatic definition of truth by
preference: truth as “satisfactory,” as something which corresponds to our
goals, as something expedient.22
Referring again to the whole text of the passage in 1.344, quoted by
Melvil as proof of Peirce’s subjective skepticism, we find that Peirce is
actually arguing against the skeptic “of the mendacious, clandestine, dis¬
guised, and conservative variety that is afraid of truth, although truth merely
means the way to attain one’s purpose” (1.344).
Every student of Peirce knows that Peirce in all his epistemological writ¬
ings opposed the skepticism, subjectivism, and nominalism of those who
willfully limited the idea of truth to personal preference or to selfish op¬
portunism instead of relating truth to the ideal goal or purpose of finding
what reality is like when investigated by the whole community of scientific
inquirers, reaching far into the future as well as the past and present.
There are, it is true, obscure passages in Peirce where it is difficult to
follow his line of demarcation between the psychological and the logical.
Murray G. Murphey has traced the development of Peirce’s philosophy
from an early positivistic nominalism toward a metaphysical realism lying
somewhere between this “prope-positivism” and “objective idealism.”23
Melvil unhistorically assumes, as Soviet philosophers following Lenin do,
that all idealistic philosophies have to be subjective, despite Marx’s own
early idealism and despite the repudiation of Berkeley’s idealism by Leibniz,
Kant, Hegel, Royce, and Peirce. Melvil also dogmatically asserts that all
religious ideas are based on superstition, although historically the religious
love of mankind and compassion for human suffering—which Peirce (fol¬
lowing Paul Carus) took to be the essence of Christianity, Judaism, and
Buddhism—have these same values in common with Marxian humanists
and their eschatology of the paradise to come with the classless society.
Peirce’s fallibilism bothers Melvil since a dialectical materialist regards
the Soviet Marxian philosophy as having established with certainty the
necessity of the laws of nature and of social history. Peirce’s lifelong critique
of “necessitarianism”24 opposes the moral and historical inevitability of in¬
dividual actions and events in both the human and physical worlds. How¬
ever, Melvil is committed to denying the individual any role in history or
nature other than the one strictly prescribed by the laws of dialectical mate¬
rialism. Melvil does not inquire whether quantum physics conforms to his¬
torical or dialectical materialism or to Peirce’s synechistic tychism.
He rejects Peirce’s notion that “science consists of inquiring, not in ‘doc¬
trine’ . . . not knowledge but the pursuit of knowledge,” and draws the
fantastic inference from these phrases, torn out of context, that Peirce “in¬
sists that knowledge is not at all necessary for the real scientist.”25 In his
1868 papers in the Journal of Speculative Philosophy, Peirce insists that
there is no cognition or intuition without a previous cognition. Nowhere does
Peirce regard “the desire to know” as sufficient or knowledge as not neces¬
sary for the scientist; Peirce always depicts the history of science as a cumu¬
lative and evolving body of knowledge in contrast to the static mechanical
notion which we find in Herbert Spencer’s definition of science as “the
organized system of knowledge.” Until very recently Spencer’s Lamarckism
was also acceptable in the Soviet Union in Lysenko’s philosophy of bio¬
logical science.
Whenever Melvil agrees with Peirce, and it is encouraging to find him
sometimes praising Peirce’s scientific and logical contributions, it is be¬
cause “Peirce probably did not suspect that in essence he expressed only
the dialectical conception of scientific cognition and of the roads to its
development.”26 Of course, the dialectical conception, allegedly held by
Peirce, should have been materialistic rather than objectively idealistic,
even though Melvil admits that Peirce indignantly rebelled against material¬
istic positions which he “willy-nilly accepted.”27
Melvil cites Peirce’s view that materialism and idealism in the nineteenth
century “went hand in hand” as deterministic philosophies of causation, but
he erroneously attributes to Peirce a mechanical Laplacean understanding
of determinism and a failure to recognize the universal character of causal
connection. Anyone at all familiar with Peirce’s continual attacks on me¬
chanical necessity and with the development of his categories would know
that Melvil is mistaken on both counts. Peirce nowhere denies the univer¬
sality of causal connections, but he did insert chance as a vera causa in
order to do justice to the variety, spontaneity, and irreducibility of qualities
in nature. Melvil does raise a good critical question when he says:

Despite the fact that it was Peirce himself who denied the existence of
inexplicable things, he now says that “chance . . . calls for no explanation”
[6.612], because it is the expression of the “pure spontaneity of mind, of its
freedom.”28
Chance or randomness might be a primitive term in the object language of
science and become explicable in the meta-language of universal causal
determinism without contradiction, if I understand Peirce’s philosophy of
chance at all. However, I find that Peirce often uses the idea of absolute
chance in a variety of puzzling ways.
Melvil cannot reconcile one clear meaning of chance in Peirce, viz. the
probability of empirical statements, with Peirce’s confidence in scientific
method, because for Melvil, Peirce should have faced the problem of “how
to unite the principal capability of science to give absolutely certain knowl¬
edge with the recognition of the imperfection of the statements made by
science, of the approximative nature of its theories, of the fallibility of its
conclusions.”29
This formulation of the problem is not as relevant to Peirce’s epistemology
as it is to the Soviet philosophers’ claims of the absolute certainty of dia¬
lectical materialism in the face of the probabilistic nature of physical laws.
All that Peirce claims is the indubitability of some propositions “for the time
being” so that “while holding certain propositions to be each individually
perfectly certain, we may and ought to think it likely that some one of
them, if not more, is false” (5.498). From this correctly quoted passage,30
Melvil draws the illogical conclusion that “Peirce did not have an objective
criterion for establishing truth or for the difference between knowledge and
belief.”

In his theory of cognition there is a lack of scientific understanding of
practice as the social objective, sensual activity of man in which his knowledge
and belief are tested by an immediate confrontation with objective reality (with
material things) which he can change. As an objective idealist Peirce took no
interest in practice. As a pragmatist, i.e., as a subjective idealist, he saw nothing
in it (practice) except sensations and the habits of the subject, except the
satisfaction of his wishes.

Like Duhem and Stallo, Peirce “in his fallibilism . . . appeared to be
moving away from the recognition of the progressive nature of science
toward the statement of the unreliability of the conclusions of science. As a
result of this, fallibilism took on a skeptical, relativistic character.”31
The answers to these false charges can be summarized briefly:

(1) Peirce’s experimentalism shows no lack of understanding of scientific
practice, and his category of “Secondness” provides for the ineluctable
immediate confrontation with objective reality;
(2) Peirce’s objective idealism does not ignore practice, but does not
identify practice with theory, as Melvil would, since Peirce sub¬
ordinates practice to the guiding principles of empirically grounded
and rationally organized theory;
(3) Peirce’s pragmaticism (ignored by Melvil) is not subjective idealism
but, on the contrary, is a powerful critique of it;
(4) Peirce always recognized the progressive evolutionary character of
science, and never regarded the relativistic approximative nature of
scientific conclusions as a logical ground for skepticism about the utter
reliability of the scientific enterprise of inquiry.

Surely, Melvil has not demonstrated his drastic conclusions: “To the
extent that Peirce was a pragmatist his views did not contain an ounce of
scientific character.”32 I hope I have shown that there is at least need for
more study and international collaboration toward a better understanding of
Peirce’s scientific pragmaticism as a distinct variety of pragmatism, subject
to criticism via the logical canons of scientific inquiry rather than by the
methods of tenacity, political authority, or a priori dialectics.

NOTES
1. “The Conflict of Science and Religion in Charles Peirce’s Philosophy,” Trans¬
actions of the Charles S. Peirce Society, Vol. II, No. 1, pp. 33-50.
2. Ibid., p. 33.
3. Ibid., pp. 33-34. Melvil’s emphasis on “political” is typical of the Soviet sub¬
ordination of philosophy to politics.
4. Cf. A. O. Lovejoy, “The Thirteen Pragmatisms,” Journal of Philosophy, Vol.
V (1908), pp. 1-12, 28-39.
5. Transactions, loc. cit., p. 50.
6. Ibid., p. 35.
7. William James’s Howison Lecture at Berkeley, “Philosophical Conceptions and
Practical Results” (1897), University of California Chronicle (1898) which also long
antedates Soviet Marxists’ emphasis today on praxis in philosophy.
8. Journal of Philosophy, 13 (Dec. 21, 1916), pp. 715-722.
9. Cf. M. R. Cohen’s article “A Brief Sketch of the Later Philosophy,” Cambridge
History of American Literature, Vol. III (1921), pp. 226-265; Chance, Love
and Logic, edited by M. R. Cohen in 1923, was the first collection of Peirce’s essays
on the logic and philosophy of science.
10. Vol. XXIX (Oct. 1922), pp. 411-430.
11. Transactions, loc. cit., p. 33.
12. Ibid., p. 34.
13. Ibid., p. 35.
14. Ibid.
15. Ibid., p. 41.
16. Ibid., p. 36.
17. North American Review, Vol. 93 (Oct. 1871), pp. 449-472, reprinted in my
edited collection of Peirce’s writings, Values in a Universe of Chance, from which
Melvil quotes half of his footnotes, so that he should have known Peirce’s refutation

18. C. S. Peirce, “What Pragmatism Is,” Monist, Vol. 15 (1905), pp. 161-181.
19. Values, loc. cit., p. 189.
20. Transactions, loc. cit., p. 45.
21. Values, loc. cit., p. 196.
22. Transactions, loc. cit., p. 41.
23. Murray G. Murphey, The Development of Peirce’s Philosophy (Harvard
University Press, 1961), 98 ff.
24. C. S. Peirce, “The Doctrine of Necessity,” Monist (April 1892), pp. 321-337.
25. Transactions, loc. cit., p. 38.
26. Ibid., p. 39.
27. Ibid., p. 40.
28. Ibid., p. 44.
29. Ibid., p. 46.
30. Ibid., p. 46.
31. Ibid., pp. 46-47.
32. Ibid., p. 49.
JOHN D. ROCKEFELLER AND THE
HISTORIANS
Sigmund Diamond

Buried in the midst of an extended discussion by Ernest Nagel of the
problem of “subjectivity” in historical interpretation is a brief footnote
which in the process of disposing of one issue raises another—one highly
suggestive for exploring the relations between historical knowledge, on the
one hand, and present behavior, on the other. Nagel says:

Deliberate falsification in the interest of some favored cause is today rare among
professional historians in nonauthoritarian countries, and uncritical acceptance
of demonstrably erroneous or inadequately supported statements of alleged
fact is not the most frequent way in which historians reveal their partisan al¬
legiances. Indeed, it is not inconceivable that each of two historical accounts of
the same period could contain only indisputably correct statements about mat¬
ters of particular (or “simple”) fact but that each would nevertheless be marked
by a distinctive bias. For two such accounts might differ in what they mention or
fail to mention, in the way they juxtapose the events both of them report, or
in the emphases they place upon various factors both admit were operative; and,
in consequence, one of the accounts might in effect be an argument for a con¬
ception of the goals and limits of human endeavor in opposition to the concep¬
tion defended by the other account.1

That present events affect interpretations of history is a cliche that those
who deny the possibility of historical objectivity take delight in—too much
delight, as Nagel reminds us. But the other side of the coin—how one’s
interpretation of the past affects his assessment of his present situation and,
therefore, his behavior in it—has by no means been remarked upon as
frequently. If, to revert to Nagel’s footnote, one historical account “might
in effect be an argument for a conception of the goals and limits of human
endeavor in opposition to the conception defended by” some other account,
then belief in the version of history exemplified in the first account will
result in a definition of the present situation different from that which
would derive from some other historical interpretation, and flowing from
different assessments and definitions of the present situation would be
different behaviors.
Let us pursue briefly the clue that Nagel has offered us, first, to docu¬
ment the changing reputation of a career that has been the subject of much
historical revisionism in recent years, and, second, to suggest some possible
relations between different evaluations of the meaning of that career and
different assessments of our present condition.
About thirty years ago, the New York Times printed an article deploring
the underemphasis on the significance of business in American history
textbooks and courses. And so one author—preparing to write a biography
of John D. Rockefeller—distributed a questionnaire among a large number
of college juniors and seniors. The results are interesting to look at. In this
day of increasing sophistication with public opinion polls, we have all be¬
come familiar with the category of “Don’t Know,” and the interpreters of
the results of the polls have to show considerable ingenuity in figuring out
what the “Don’t Knows” really mean. But there were no “Don’t Knows”
about Rockefeller among the college students of thirty years ago. Everybody
had an opinion, everybody knew why—or at least had a reason why—he
was important in American history. For some, it was because he was the
father of the Standard Oil Company; for others, because he was the father
of organized philanthropy; for still others, because he was the father of the
trust movement; and for a surprisingly large number—who must have
been from the Ivy League—because he was the father of John D.
Rockefeller, Jr.2
Many of the evaluations about Rockefeller made by the college students
of thirty years ago were, of course, contradictory—so contradictory, in¬
deed, that it seems almost as if the students were judging several different
people instead of one man of several reputations. But what emerges most
clearly from their answers is the apparent effect on their own beliefs and
behavior of whatever evaluation of Rockefeller they preferred to make.
For the vast majority of them, John D. Rockefeller had been a con¬
structive force in American life; and he was, therefore, a figure to be
admired, a person to be emulated. His “is the shining example of a poor
boy rising to the zenith; it serves me as a stimulus”; “he teaches me never
to be discouraged”; “his life shows that there are great rewards for
hardworking youth”; “he is an impetus for inspiring me”; “I rise early,
work hard, and save diligently, and his life makes me optimistic for the
future”3—it was in these very words that the students of thirty years ago
revealed how they had evaluated Rockefeller, how they had internalized
the qualities they deemed had been demonstrated by Rockefeller, and how
the judgment they made about Rockefeller affected their own behavior.
What I hope emerges from this too brief introduction to a very com¬
plicated problem is the fact that the importance of making a judgment
or evaluation of some historical personality goes far beyond doing justice
to him, important though that is. But more to the point, the way in which
we assimilate our history, the judgments and evaluations we make of our
own past, affect the attitudes we have and the actions we take in defining
the problems we feel called upon to solve in the present.
If, in the course of analyzing the factors that must be taken into account
in evaluating Rockefeller, we should find—as indeed we will—that the
evaluations and assessments of the man have changed with the passage of
time, that many of the factors which must be considered have not always
received their proper attention, we shall have pointed to the existence of
an interesting, and even important, problem in American history. Why have
evaluations of John D. Rockefeller differed so markedly even at the same
moment in time, and why have they changed over time? And what may
be learned about American society from the fact that, with the passage
of time, we are now all asked to believe that he was “really” the kind of
man revealed to us in the most recent works of historical scholarship?
In 1900, Oscar Lovell Triggs, a professor of English literature at the
University of Chicago, made a speech in which he said that John D.
Rockefeller was superior to William Shakespeare in respect of his value
to humanity. Charles E. Perkins, president of the Chicago, Burlington,
and Quincy Railroad, who had just made an address in which he claimed
that Marshall Field was a greater man than Shakespeare, immediately sat
down and wrote a letter to Professor Triggs, commenting on the Chicago
Tribune editorial disagreeing with the professor: “The Tribune says
Shakespeare goes on forever while Rockefeller is soon forgotten,” he wrote.
“The influence of the things Rockefeller does goes on forever also. The
world wants both of course. But Rockefeller is much more a necessity than
Shakespeare.”4
If the American people as a whole had agreed with Professor Triggs
and with railroad president Perkins in the way in which they evaluated
Rockefeller, then Rockefeller would not today be studied as a “problem”
in American history courses, and—more important—it would not have been
necessary for him to write, as he did write in 1907:

A great responsibility rests upon newspapers. There is an ethical duty they
owe to the community which I sometimes feel they regard too lightly. These
stories about my wealth, for instance. They have a bad effect on a class of
people with whom it is becoming difficult to deal, and when I say this I am
not forgetting that a great responsibility rests upon wealth. But the stress which
is laid in those stories arouses hatred and fear not against the individual only
but against organized society. It is the general trend I have in mind, which
is arraying the masses against the classes.5

What Rockefeller would have liked the American people to believe of
him, how he wanted them to evaluate him, is perhaps best summed up
in one short paragraph from his own autobiography:

If I were to give advice to a young man starting out in life, I should say to
him: If you aim for a large, broad-gauged success, do not begin your business
career . . . with the idea of getting from the world by hook or crook all you
can. . . . Let your first thought be: Where can I fit in so that I may be most
effective in the work of the world? . . . Enter life in such a spirit. . . . The man
will be most successful who confers the greatest service on the world.6
Rockefeller’s statement about himself becomes, in the most recent
authoritative biography of him, the historian’s judgment about him as well.
But that others felt differently at least at one time may be inferred from
the fact that the same quotation from Rockefeller’s autobiography, cited so
approvingly by Professor Nevins, appears in Upton Sinclair’s 1915
anthology, The Cry for Justice, in the section entitled Humor.7
If Rockefeller had indeed acted in the way and for the reasons he im¬
plied in his autobiography, then there could hardly be any valid reason for
dissenting from his own judgment about himself or from that of his most
recent biographer that his major qualities were “organizing genius,
keenness of mind, tenacity of purpose, and firmness of character.”8
But we know that dissent there was and dissent there still is; and so it
becomes important to inquire into the factors which have led to such
disparate judgments about the significance of this man in American history.
Judging from the way in which even the most sober newspapers have
kept box scores of the number of times that a Jimmy Hoffa or a Dave
Beck or an alleged communist has invoked the Fifth Amendment against
self-incrimination or given an evasive reply to an investigating committee,
we would have to conclude that the American people are exceedingly
righteous, that the esteem, or lack of it, in which they hold their fellow man
is based in large part on the degree to which that person strikes them
as honest and candid.
It may be instructive, therefore, in our effort to inquire into the factors
that have led to such different evaluations of the significance of Rockefeller
and the Standard Oil Company, to turn the clock back to an earlier
period. The year is 1879, the place Pennsylvania, and four officials of the
Standard Oil Company are indicted for conspiracy, among other things,
to drive competitors out of the oil industry. At the same time they are
summoned before the court to give testimony in another case in which their
competitors are seeking to enjoin them from acting in concert against them.
At first, they refuse to appear even in response to a subpoena; when they
do, they refuse to testify on the grounds of self-incrimination.9
The year is 1880, the place New York, and officials of the Standard
Oil Company are called before the Hepburn Commission of the New York
State Assembly. “How can I,” says Jabez Bostwick, one of those facing
indictment in Pennsylvania, “a man soon to be tried for conspiracy, be
expected to answer these questions? I shall incriminate myself. ... I refuse
to answer, lest I incriminate myself.”10
Constant repetition of the same situation forces the Hepburn Commis¬
sion to report that the Standard Oil Company is “a mysterious organization
whose business and transactions are of such a character that its members
decline giving a history or description of it lest this testimony be used to
convict them of a crime.”11
The time is 1876, the place is Washington, and Oliver H. Payne, treasurer
of the Standard Oil Company, is called before the House Committee on
Commerce to testify regarding the first interstate commerce bill. He
refuses to answer questions, “on advice of counsel”—an answer, in¬
cidentally, which later congressional committees were not to recognize as a
sufficient basis for refusal to testify.12
Or, as an example of evasiveness, consider the testimony of John D.
Rockefeller himself before the New York Assembly:

Q. Was there a Southern Improvement Company?
A. I have heard of such a company.
Q. Were you not in it?
A. I was not.13

Rockefeller was, of course, perfectly right in answering as he did. He
had been a member of the South Improvement Company, not the Southern
Improvement Company.
Or consider, as another example of his evasiveness, his testimony before
the New York Assembly in 1888. He could not remember the capitalization
of Standard Oil of New York; he could not remember the highest
price that had been paid for Standard Oil stock; he did not know of any
agreements between Standard Oil and any railroad companies; he did not
know the rates charged by his pipeline companies; he could not remember
the name of a single competitor of Standard Oil; he did not know the
profits of his company or the percentage of Standard Oil capacity that was
idle; he was not even sure that there were bylaws for the Standard Oil
Company or, if there were, where they might be found.14
Or, finally, consider the sworn affidavit of Rockefeller that he submitted
to the Ohio legislature in 1880: “It is not true . . . that the Standard Oil
Company directly or indirectly through its officers or agents owns or
controls the works of Warden, Frew and Company, C. Pratt and Company,
Lockhart, Frew and Company. ... It is not true that the Standard Oil
Company directly or indirectly purchased or acquired the Empire Transportation
Company or furnished the money therefore.” Long before that
affidavit was written in 1880, the works of Pratt, Lockhart, and Warden
had been bought and paid for in Standard Oil stock, and on October 17,
1877—three years before the affidavit—Standard Oil had paid $2,500,000
in certified checks toward the purchase of the refinery of the Empire
Transportation Company.15
The significance of these events should not be misunderstood. It was not
wrong for Rockefeller and his associates to have refused to testify on the
grounds of self-incrimination, nor should any inferences be drawn as to
their guilt. These episodes are cited only because, though they were considered
by Rockefeller’s contemporaries as factors that had to be taken
into account in explaining his position in American society, in our time
they are less often considered as important in the evaluation of his career.
With respect to other issues as well, it would be possible to show that considerable
discrepancies exist between the image of Rockefeller on the basis
of which we are asked to cast a favorable verdict on his role in American
history and the image which led many of his contemporaries to believe that
his life was not an unmixed blessing to the American people. These differences
in evaluation, it should be pointed out, do not all arise from the fact
that two observers of the same facts may come to different conclusions because
they interpret the facts differently and that all we need do is pay our
money and take our choice between the different interpretations. On the
contrary, many of the factors on which the evaluation of Rockefeller must
be based are not matters of interpretation at all; they are findings of fact,
and not even an historian’s findings of fact but judicial determination of fact.
It is true, of course, that there are certain points at which the evidence
is susceptible of different interpretations. Professor Nevins, for example,
noting that Rockefeller did not serve in the Civil War and that his claim
that he had sent thirty men into the Union Army could not possibly be
true because his own journal shows contributions of only $138.08 during
the war years, dismisses the matter as follows: “After all, no compelling
moral reason existed why he should enlist. . . .”16 But for journalist Cal
Tinney, the significance of the same set of facts is quite different: “He was
smart. Complete proof he was smart showed up when he was twenty-two,
at which time he refused to join up for the Civil War and, taking seventy-
five more years to die, got rich. Was ever better evidence offered that
Pacifism Pays?”17
Professor Nevins, to take another example, noting that Rockefeller kept
his wife fully informed about his plans regarding the South Improvement
Company, concludes that “this alone goes far to prove that he was convinced
he was following a proper course, for Mrs. Rockefeller was a highly
religious woman with stern moral standards.”18 But it would not, I think,
cause much dismay if another historian were to interpret the same fact
differently.
But, as I have said, many of the factors that must be taken into
account in casting up the Rockefeller balance sheet are not, properly
speaking, matters of inference or interpretation at all. It is not, for example,
a matter of interpretation that the Standard Oil Company practiced industrial
espionage. “Wilkerson and Co. received car of oil Monday 13th,”
wrote a Standard Oil official. “70 barrels which we suspect slipped through
at the usual fifth class rate—in fact you might say that we know it did—
paying only $41.50 freight from here. Charges $57.40. Please turn another
screw.”19
It is not a matter of interpretation that the Standard Oil Company gave
$350,000 to the editor of the Denver Post to induce him to cease his
opposition to the natural gas rates that had been established for that city.20
It is not a matter of interpretation that the Standard Oil Company subsidized
newspapers from Pennsylvania to Colorado in an effort to influence
their editorial policy, or that George Gunton, an erstwhile labor leader,
published his magazine with secret Standard Oil subsidies.21
It is not a matter of interpretation that Standard Oil officials, among
them John D. Archbold, paid money to members of Congress to
influence legislation.22
Nor, finally, is it a matter of interpretation that the Standard Oil Company
engaged in discriminatory practices, practiced price cutting to drive
competitors out of business, operated bogus oil companies, engaged in
industrial espionage. Not only are these not matters of interpretation; they
are findings of fact, judicially determined by the Supreme Court at the
time of its great decision.23
There was a time when all this was denied, when the American people
were asked to evaluate and assess Rockefeller and his role in American
society as if these things simply had not happened. But that is no longer
true; no sophisticated defender of Standard Oil really denies them any
longer. Instead, we are asked above all to keep two things in mind in
judging Rockefeller’s contribution to the development of American industry:
first, that we must not judge him by the standards of our time but
by those of his own time, and in his own time others were doing the
same things that he was charged with doing; second, conceding even that
his methods were open to criticism, that he captured a vision of combination
and order in American industry and, despite great opposition, succeeded
in transforming an economy characterized by the existence of small,
wasteful, competitive units into an economy of efficient, large-scale enterprise.
The first of these elements that go into the new evaluation of Rockefeller
is not, I think, particularly serious. It is no doubt true that many other businessmen
were doing what Rockefeller did, but that fact is not in itself evidence
that the social norms of the period were so lax, so flexible that
anybody could do anything he pleased in business. If that were the case, then
how could we explain the roar of disapproval that was aroused by his activities,
disapproval that was far greater in his own day than in ours, when
presumably the standards governing business behavior are even stricter?
Moreover, Rockefeller was charged and found guilty not of violating the
expectations that people had regarding proper behavior; he was found
guilty of law-breaking. It can hardly be documented that law-breaking by
businessmen has ever been accepted as a norm of American society, except
perhaps by those who engage in the practice.
And, finally, the argument that others did it too was used by Rockefeller
in his own defense and was rejected by many of his contemporaries.
He admitted some wrongdoing, said the Supreme Court, but “laid it to too
great individual zeal in the keen rivalries of business or of the methods
and habits of dealing which, even if wrong, were commonly practiced at the
time.”24 After taking note of the argument, the Supreme Court rejected
it. Shared guilt, Rockefeller implied, is not guilt at all; shared guilt, said
the Court, is guilt nonetheless.
But the second argument is a more serious one and deserves greater consideration.
Let us, therefore, turn to the classic statement of the argument
as it appears in Professor Nevins’ biography of Rockefeller:

He early caught a vision of combination and order in an industry bloated,
lawless, and chaotic. Pursuing this vision, he devised a scheme of industrial
organization which, magnificent in its symmetry and strength, worldwide in its
scope, possessed a striking novelty. The opposition which he met was massive
and implacable. . . . He and his partners marched from investigation to investigation,
from suit to suit, under a growing load of opprobrium. But they
moved imperturbably forward. They believed that the opposition was mistaken
and irrational. In their opinion it represented a wasteful anarchy; the full
victory of this competitive laissez-faire individualism would mean retrogression,
confusion and general loss. . . . The struggle against Rockefeller's movement
for industrial consolidation was not a struggle against criminality: It was largely
a struggle against destiny.25

The historian is asking us to believe that Rockefeller’s contribution
to American industry and American society was that he placed himself at
the head of the movement for combination and rationalization, that the
issue posed by his career was whether the United States would be shackled
by the wastefulness and inefficiency of laissez-faire individualism or
would move forward into a new era of economic rationality, that those who
opposed him were struggling against destiny.
Even if we were to deny one of Professor Nevins’ assumptions, an assumption
unpardonable in an historian—that the outcome of an event is
the same as the motives of those who precipitated it—what remains still
constitutes a telling argument. If there is such a thing as a wave of the
future and if Rockefeller represented that wave of the future in his own time,
then he would be deserving of the evaluation Professor Nevins makes of
him and the very best that could be said of his opponents would be that
they were foolish and misguided. But—and this I think is the critical point
—Professor Nevins’ evaluation of Rockefeller depends on the historical
validity of what he tells us was the issue that faced the American people
in the late nineteenth and early twentieth centuries—the choice, as he says,
between anarchic laissez-faire individualism and the efficiency of large-
scale organization.
But suppose that was not really the issue; suppose those were not really
the alternatives; suppose, even, that many of Rockefeller’s opponents were
as much opposed to laissez-faire as he was. What, then, would become of
Professor Nevins’ evaluation of Rockefeller?
It becomes, then, a matter of the greatest importance to see if Professor
Nevins’ alternatives were the ones that really faced the American people, to
see if it is true, as he says, that Rockefeller’s life posed the issue of laissez-faire
anarchy versus large-scale efficiency. And so, very briefly, we must
take a look at the texts.
Standard Oil “has known, not guessed at conditions,” wrote Ida Tarbell.
“It has had a keen authoritative sight. It has applied itself to its tasks with
indefatigable zeal. It has been as courageous as it has been cautious.” So
far Miss Tarbell sounds like Professor Nevins. But now note the difference:

These qualities alone would have made a great business, and unquestionably
it would have been along the line of combination, for when Mr. Rockefeller
undertook to work out the good of the oil business the tendency to combination
was marked throughout the industry, but it would not have been the combination
whose history we have traced. To the help of these qualities Mr. Rockefeller
proposed to bring the peculiar aids of the South Improvement Company.26

In short, it appears that Miss Tarbell, Rockefeller’s enemy, agrees with
Professor Nevins, Rockefeller’s defender, that combination was on the way;
and it appears even further that what she opposed were not combinations in
general but the characteristics of this particular one.
Suppose, now, that we examine that other great antagonist of Standard
Oil, Henry Demarest Lloyd, to see if he conforms to Professor Nevins’
notion of Rockefeller’s opponents as being in favor of an outmoded, inefficient,
laissez-faire individualism. Lloyd wrote:

It is one of the paradoxes of public opinion that the people of America, least
tolerant of this theory of anarchy in political government, lead in practising it
in industry. Politically, we are civilized; industrially, not yet. Our century, given
to this laissez-faire . . . has done one good: it has put society at the mercy
of its own ideals, and has produced an actual anarchy in industry which is
horrifying us into a change of doctrines.27

So far this sounds like Professor Nevins talking about the opponents of
Standard Oil, but it is Lloyd talking about Standard Oil.
If these opponents of Rockefeller were as much opposed to laissez-faire
as he was (and as Professor Nevins is), and if they, too, like Rockefeller,
regarded combination as inevitable, why then did they oppose him? “In casting
about for the cause of our industrial evils,” Lloyd said,

public opinion has successively found it in competition, combination, the
corporations, conspiracies, trusts. But competition has ended in combination,
and our new wealth takes as it chooses the form of corporation or trust, or
corporation again, and with every change grows larger and worse. Under these
kaleidoscopic masks we begin at last to see progressing to its terminus a steady
consolidation, the end of which is one-man power. The conspiracy ends in one,
and one cannot conspire with himself. When this solidification of many into
one has been reached, we shall at last be face to face with the naked truth that
it is not only the form but the fact of arbitrary power, of control without
consent, of rule without representation that concerns us.28

It is clear, then, that to these opponents of Rockefeller, at least, the choice
was not between laissez-faire and consolidation. What, then, were the alternatives?
“The new self-interest,” Lloyd wrote (the identification of the individual
with society), “will remain unenforced in business until we invent the forms
by which the vast multitudes who have been gathered together in modern
production can organize themselves into a people there as in government.”29
These statements do not of course constitute proof, but they strongly suggest
that the alternatives on the basis of which the current evaluation and
assessment of Rockefeller are reached were not the alternatives that his contemporaries
really faced. For them, his life did not pose the choice of
individualism versus consolidation. It posed the problem of whether consolidation,
which they recognized as indeed inevitable, would be left uncontrolled
or would be regulated in the interests of the public.
If the meaning of Rockefeller’s career is not the triumph of consolidation
over laissez-faire but the still-not-resolved problem of in whose interests
this consolidation shall be accomplished, and if, as we have seen, so many
discrepancies exist between reality and the image of Rockefeller presented
to us, we must ask ourselves why that image has the form and content that
it does. The answer is, I think, that in these works we are not really given
an analysis of the behavior of a man; we are instead presented with a
symbol. And as with all symbols, the ideas that are associated with the name
of the symbol have no necessary connection with the reality behind the
name. Rockefeller himself admitted that “ever since I was a young man, I
have had a daily nap, sometimes two”; but he is presented by the image-
makers as a man whose very relaxation was only variety of labor. He once
admitted to B. C. Forbes, the business publicist, that while people persisted
“in thinking that I was a tremendous worker, always at it early and late,
summer and winter,” the “real truth is that I was what would now be called
a ‘slacker’ after I reached my middle thirties. ... I never, from the time
I first entered any office, let business engross all my time and attention.”30
But for the symbol-makers, such protestations—even when they came from
Rockefeller himself—were of little significance. The makers of symbols are
never interested in asking of their subject the question, What are you really?
Instead, they ask, What can you make of others? For it is the function of a
symbol to stimulate the voluntary acceptance of the same values, the same
behavior, the same discipline by the viewer as are allegedly incarnated in
the man being symbolized.
Why, then, do the symbol-makers seek to have people internalize the
values and standards that they associate with the name of Rockefeller? It
must not be forgotten that the United States of the twentieth century is not
the United States in which Rockefeller flourished. The same newspapers
that announced the death of Rockefeller in 1937 announced also that the
United States Supreme Court had that day upheld the constitutionality of
the Social Security Act; and on the day after his death, the Northern Baptist
Convention, with which he had so long been associated, passed one resolution
mourning his death and another saying that “The building of the Kingdom
of God on earth lays upon . . . individual Christians the compulsion
to work for a living minimum wage and a maximum income, set by law,
so as to make possible the minimum wage.”31 The environment in which
business had to operate in the mid-1900’s was not nearly so secure as it
had been during the heyday of Rockefeller, and if it were not to deteriorate
still further, the values of people—and therefore their actions—would have
to be altered.
Lest I be accused of having given the impression here that there is a
kind of conspiracy in which the agents of the Standard Oil Company surreptitiously
succeed in recruiting the molders of public opinion into their
own service, let me hasten to say that such an impression was inadvertent
and that I do not believe it myself. My own feeling about the matter is
perhaps best summarized in a short poem (by Humbert Wolfe) about British
newspaper men, which has, however, a wider applicability:

You cannot hope to bribe or twist,
Thank God, the British journalist.
But, seeing what the man will do
Unbribed, there’s no occasion to.32

As to the effect of the change that has taken place in the evaluation of
John D. Rockefeller from his time to our own, let me say only this. In
1922, long before it became popular, at least in intellectual circles, to
inveigh against conformity, John Dewey pointed to an important characteristic
of the American scene:

Our forbears who permitted the growth of legal and economic arrangements at
least supposed, however mistakenly, that the institutions they favored would
develop personal and moral individuality. It was reserved for our own day to
combine under the name of individualism, laudation of selfish energy in industrial
accomplishment with insistence upon uniformity and conformity of
mind.33

Not the least of the problems posed by Rockefeller’s life—and the ways in
which that life has been evaluated differently over time—is whether it is
possible to encourage what Dewey called “selfish energy in industrial accomplishment”
without at the same time encouraging the conformity that
we all say we deplore.

NOTES
1. Ernest Nagel, The Structure of Science (New York and Burlingame, 1961),
580—81, note 25.
2. William H. Allen, Rockefeller: Giant, Dwarf and Symbol (New York, 1930),
3. Ibid.
4. The episode is discussed in Edward Chase Kirkland, Dream and Thought in
the Business Community, 1860-1900 (Ithaca, New York, 1956), 166.
5. Quoted in Chicago Herald and Examiner, May 28, 1937.
6. John D. Rockefeller, Random Reminiscences of Men and Events (New York,
1909), 143.
7. The authoritative biography is, of course, Allan Nevins, John D. Rockefeller:
The Heroic Age of American Enterprise (New York, 1940). The quotation appears
on page 455 of the 1963 reprinting of Sinclair’s anthology.
8. Nevins, II, 714.
9. Matthew Josephson, The Robber Barons (New York, 1934), 274—75.
10. Ida M. Tarbell, The History of the Standard Oil Company (New York, 1925),
I, 243.
11. Ibid., I, 231.
12. Ibid., I, 169.
13. Ibid., II, 132; Josephson, 275.
14. Allen, 572.
15. Tarbell, I, 230-31.
16. Nevins, I, 140.
17. New York Post, May 27, 1937.
18. Nevins, I, 336.
19. Josephson, 268. See facsimile of letter in Tarbell, II, 45.
20. Ferdinand Lundberg, America’s 60 Families (New York, 1937), 250-51. See
also Nevins, II, 517.
21. Lundberg, 247-51.
22. Nevins, II, 505-15.
23. 221 U.S. 1; 31 Supreme Court Reporter, 502.
24. Quoted in Allen, 33.
25. Nevins, II, 709. The entire paragraph deserves scrutiny as an example of the
way in which a sense of conviction as to the correctness of the argument derives from
the literary skill with which it is put together rather than from the character of the
evidence and the validity of the inferences drawn from the evidence. The imperturbability
of Rockefeller and his associates is doubtless supposed to conjure up a vision
of calm serenity, of supreme confidence in the rightness of a certain course of action
even in the teeth of savage opposition. The picture is an appealing one, but (a) there
is no positive evidence that Rockefeller and his associates were indeed imperturbable,
and (b) Rockefeller’s increasing concern with changing public opinion through direct
and indirect influence on the press, through philanthropy, and through new programs
in industrial relations might be construed as evidence of a lack of imperturbability.
26. Tarbell, II, 274.
27. Henry Demarest Lloyd, Wealth Against Commonwealth (New York, 1894),
496.
28. Ibid., 511.
29. Ibid., 523.
30. Bertie Charles Forbes, Men Who Are Making America (New York, 1921),
299.
31. Baltimore Sun, May 24, 1937; Newark Evening News, May 25, 1937.
32. Humbert Wolfe, The Uncelestial City (New York, 1930), 30-31.
33. John Dewey, “Mediocrity and Individuality,” New Republic, XXXIII (1922),
35.