Divine Authorship? Responses to Letters to the Editor

Follow-up to "Divine Authorship? Computer Reveals Startling Word Patterns": Responses by Dr. Jeffrey Satinover to Letters to the Editor of Bible Review

©1996 Jeffrey B. Satinover
Reprinted from Bible Review, Volume X, Number 1, February 1996

[The February, 1996 issue of Bible Review contained many letters commenting on Dr. Jeffrey Satinover's article on the "Codes in Torah" (ELS patterns), "Divine Authorship?", which had been published in the October 1995 issue. We reprint here Dr. Satinover's response to these correspondents. Note that their actual letters are not reprinted here, nor are correspondents' names. These are the only edits made to Dr. Satinover's response as published in Bible Review.]

The robustness of the Torah codes findings derives from the rigor of the research. To be published in a journal such as Statistical Science, it had to run, without stumbling, an unusually long gauntlet manned by some of the world's most eminent statisticians. The results were thus triply unusual: in the extraordinariness of what was found; in the strict scrutiny the findings had to hold up under; and in the unusually small odds (less than 1 in 62,500) that they were due to chance. Other amazing claims about the Bible, Shakespeare, etc., have never even remotely approached this kind of rigor, and have therefore never come at all close to publication in a peer-reviewed hard-science venue. The editor of Statistical Science, himself a skeptic, has challenged readers to find a flaw; though many have tried, none has succeeded. All the "first crack" questions asked by BR readers – and many more sophisticated ones – have therefore already been asked by professional critics and exhaustively answered by the research. Complete and convincing responses to even these initial criticisms can get fairly technical. I can here clarify only up to a point some of the misunderstandings in the letters. But I will also send more complete private responses to each writer whose comments are not fully answered here.

A number of writers think it a weakness that an event must occur before it can be found in the text. This seems to me like watching an elephant fly backward through time only to complain, "But why isn't it traveling into the future, too?" What the research doesn't do in no way detracts from what it does. Nonetheless, there are ways to evade this constraint, as [one letter-writer] suggests, though there are difficulties. Verifiable results depend upon a sufficiently large and uniform data set, which is not easy to assemble. Isolated examples can rarely generate statistically valid results: Looking for one's own name, or for Jesus', is therefore an uninterpretable enterprise, even if you find what you're looking for.

If the research is eventually disproved, no one serious will bother using the text of Genesis as an oracle. But should the body of rigorous findings expand, some will even so be inclined to ignore the primary meaning of that same text – including its cautions. The Torah itself casts this most typical of human inclinations at the center of its most distressing dramas – e.g., the garden, the snake, "ye shall be as gods," etc. – and it suggests that what follows is more consequential than the mere "demerits" of scouting, pace [another letter-writer].

[A third correspondent] quickly identifies a major conflict between the higher-critical (more generally modernist) worldview and the implications of the codes research. He reminds us that there is a huge mass of Bible-related data that is convincingly synthesized in line with the conventionally anti-supernatural assumptions of modern scholarship. [The third writer's] chief objection – and one of [a fourth writer]'s – is to what the codes research appears to imply concerning the integrity of the Masoretic Genesis, that is, that what we have today is the 100-percent error-free replica of the original. But, as will be explained below, he has not found the flaw in the research he thinks he has (and that he seems to have felt he had better find quickly!). Neither is the integrity of the text so evident.

In a similar vein, [the fourth letter-writer] recognizes that no text can be both densely encoded and at the same time tell a previously established story. Rather, no human being could compose such a text. [This writer] also correctly notes that researchers with a computer can just "keep trying until it does get lucky" so as to "yield any damn-fool message." He thus pointedly raises the important question of unreported "hidden failures," a well-known bugaboo in all scientific research. To exclude this possibility, the referees insisted that the researchers repeat their positive findings on a separate set of individuals selected according to the referees' own criteria. The paper reports on the results for this second set alone.

Related to the question of unreported failures is the potential for hoaxes. Given the many frauds science has unfortunately fallen prey to, especially of late, I'd better play the straight man: I am not joking and I have no reason to suspect anyone else is. To rule this out, the referees were given the computer programs to do with as they pleased. Furthermore, the data – the text of Genesis, the names and dates of the individuals – have long been published in public-domain texts. They cannot be faked. Anyone may start from scratch and arrive at the same results, as has been done repeatedly. Nonetheless, perhaps one day I will be shown to have been a fool in this matter and, along with others, a boyish one at that (my translation of "puerile") – but neither a liar nor practical joker. As for simple chance explaining it (in contrast to unrecognized error), well, we know exactly what the odds are – 1 in 62,500; it's what makes the case, not breaks it. And more is in the pipeline, with even more daunting odds.

Now on to some details. Some writers ([the third correspondent]) conclude that the shape of the snippet of text is subject to whim and determines which word-pairs can be found. Others ([the fourth correspondent]) assume the method is so generous that one can find anything at all. These would be serious weaknesses indeed were they as described. However:


(1) The identical search and analysis procedure is followed for Genesis and for all controls. As there is nothing in either procedure that favors any set of data, the results in Genesis should differ from the results in the controls by no more than small, chance variations. But the results in Genesis are dramatically different. The odds are vanishingly small that this exceptional difference has occurred merely by chance.

(2) The search method is precise and highly constrained. It does not depend upon a particular rectangle of text such as those in the illustrations. (These are only used to calculate minimal proximities.) One may more easily envision this by first considering a simple string of the entire text, some 78,000 letters long. The search-and-find part of the procedure is carried out on this entire text string, not on snippets of it. Roughly described, both words of a pair are located by finding where each appears at its own respective minimum equidistant letter sequence (ELS). In theory, these minimum sequences may have a spacing of any number of letters (though no longer than c. 78,000 divided by one less than the number of letters in the word); in practice, they tend to be quite short relative to the total length of Genesis. But as we will see, the actual ELS does not bias the results. In a string, the proximity of two terms in a pair may be defined in any number of ways: For instance, from the beginning of the rightmost word (Hebrew reads right to left) to the end of the leftmost (this would give the largest distance); from the beginning of the leftmost word to the end of the rightmost (this would give the shortest distance); between the midpoints of each, etc. Even in this simplified picture, defining proximity in a uniform, meaningful way for many such pairs of words, mostly of different lengths and letter spacings, requires care. It would be meaningless, for example, to treat a four-letter word spread out over the entire text (ELS of about 26,000) as "close" to every other word.
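
For readers who think more easily in code, the string search just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' program: the function names (minimal_els, linear_distances) are invented here, the search runs forward through the string only, and an ordinary Latin-letter string stands in for the Hebrew text.

```python
def minimal_els(text, word):
    """Return (start, skip) for the smallest skip at which `word` appears
    as an equidistant letter sequence in `text`, or None if it never does.

    Hypothetical helper for illustration; forward skips only.
    """
    n, k = len(text), len(word)
    if k < 2 or k > n:
        return None
    # Largest skip that still fits: (text length - 1) // (word length - 1).
    max_skip = (n - 1) // (k - 1)
    for skip in range(1, max_skip + 1):
        span = (k - 1) * skip  # distance from first to last letter
        for start in range(n - span):
            if text[start:start + span + 1:skip] == word:
                return start, skip
    return None


def linear_distances(hit_a, len_a, hit_b, len_b):
    """Three possible string distances between two located words.

    Each hit is a (start, skip) pair as returned by minimal_els.
    Returns (outer, inner, midpoint); inner is negative if the words overlap.
    """
    first_a, last_a = hit_a[0], hit_a[0] + (len_a - 1) * hit_a[1]
    first_b, last_b = hit_b[0], hit_b[0] + (len_b - 1) * hit_b[1]
    outer = max(last_a, last_b) - min(first_a, first_b)     # largest distance
    inner = max(first_a, first_b) - min(last_a, last_b)     # shortest distance
    midpoint = abs((first_a + last_a) / 2 - (first_b + last_b) / 2)
    return outer, inner, midpoint
```

On the toy string "xaxbxcx", minimal_els finds "abc" starting at index 1 with a skip of 2. The three distances returned by linear_distances correspond to the largest, shortest, and midpoint definitions mentioned above, and they can disagree sharply for widely spaced words, which is why a linear measure alone is not enough.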

(3) The specific method used in the research to calculate proximity is more sophisticated than counting letters on a string and is designed to avoid the problems inherent in linear measures. First, the portion of Genesis that includes all of both words and everything in between is cut out from the text after the words have been found. For words that are very far apart, or have large ELS's, this is a huge piece of text. The resulting string is then wrapped into a rectangle (more precisely, a helix, forming a cylinder). The rectangle is reshaped until the two words are as close together – "compact" – as possible. Despite this manipulation, words that were far apart in the original string, or with widely spaced ELS's, can never form as compact a configuration as words with short ELS's that were close together. The most compact rectangle is used in every case. Remember, this method is applied to Genesis and all controls. The procedure therefore favors none.
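
The wrapping step can also be sketched in Python. What follows is a toy compactness score assumed for illustration, not the statistic used in the published paper: it lays the cut-out string in rows of a given width (a flat rectangle, ignoring the cylinder's seam), adds the squared spacing of each word's own letters to the squared gap between the two words, and then tries every width. The published procedure is more elaborate and makes no use of these function names, which are hypothetical.

```python
def letter_positions(start, skip, length):
    """String indices occupied by a word found at (start, skip)."""
    return [start + i * skip for i in range(length)]


def squared_grid_distance(p, q, width):
    """Squared distance between two string indices once the string
    is written out in rows of `width` letters."""
    row_p, col_p = divmod(p, width)
    row_q, col_q = divmod(q, width)
    return (row_p - row_q) ** 2 + (col_p - col_q) ** 2


def spread(a, b, width):
    """Toy score: smaller means the two words sit more compactly."""
    origin = min(a + b)  # cut the text so it begins at the first letter used
    a = [p - origin for p in a]
    b = [p - origin for p in b]
    own_a = max((squared_grid_distance(p, q, width)
                 for p, q in zip(a, a[1:])), default=0)
    own_b = max((squared_grid_distance(p, q, width)
                 for p, q in zip(b, b[1:])), default=0)
    gap = min(squared_grid_distance(p, q, width) for p in a for q in b)
    return own_a + own_b + gap


def most_compact(a, b):
    """Try every row width and return (best score, width that achieves it)."""
    span = max(a + b) - min(a + b) + 1
    return min((spread(a, b, w), w) for w in range(1, span + 1))
```

For a three-letter word at indices 0, 4, 8 and a two-letter word at indices 1, 5, the best arrangement is a rectangle four letters wide, where both words run straight down adjacent columns. Because this toy version allows any width at all, it is more forgiving of distant words than the constrained measure described above, so it should be read as a picture of the reshaping idea only.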

(4) In addition to searching for pairs of historically connected names and dates, the researchers performed identical searches and measurements on sets of unrelated pairs, created by matching one person's name with another's date. (This aspect of the research addresses questions 9-11 of [the first letter-writer].) There are 32! (that is, 1 × 2 × 3 × ... × 31 × 32 = c. 2.6 × 10^35) possible mismatched sets. From these, 999,999 were selected at random, which, along with the set of actual name-date pairs, made a total of 1,000,000 sets of pairs. The researchers allowed for variant spellings of names and alternative death dates, and also generated four different proximity statistics for every one of these 1,000,000 sets, both when running the test on Genesis and when running it on control texts. They then ranked the 1,000,000 sets in ascending order of compactness (rank 1 = most compact). In all four measures, the rank of the actual historical data, and only in the actual Genesis text, stands out starkly as many thousands of times closer to 1 than the false sets of data either in Genesis or in any control text. The proximities of the false pairings, however, fell well within the range that would be expected by chance, whether located in Genesis or in the controls.
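
The ranking logic of this test is simple enough to sketch in Python. The sketch below is an assumption-laden illustration, not the researchers' code: the function name rank_among_shuffles is invented, the score function is a placeholder to be supplied by the caller (lower meaning more compact), and ties are resolved in favor of the true pairing.

```python
import math
import random


def rank_among_shuffles(names, dates, score, trials, seed=0):
    """Rank the true name-date pairing among `trials` random re-pairings.

    `score` maps a list of (name, date) pairs to a number, lower = more
    compact. Rank 1 means no shuffled pairing scored strictly lower.
    Returns (rank, number of sets compared).
    """
    rng = random.Random(seed)
    true_score = score(list(zip(names, dates)))
    rank = 1
    for _ in range(trials):
        shuffled = list(dates)
        rng.shuffle(shuffled)  # re-pair each name with a randomly chosen date
        if score(list(zip(names, shuffled))) < true_score:
            rank += 1
    return rank, trials + 1


# Number of ways to pair 32 names with 32 dates.
print(f"{math.factorial(32):.2e}")  # prints 2.63e+35
```

With trials set to 999,999 the comparison is made over 1,000,000 sets, as in the study. If the true pairing were no more special than any other, its rank would be equally likely to fall anywhere from 1 to 1,000,000; a rank very near 1 is what signals that it is not.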

(5) The examples shown in the article are good illustrations. But they are also fairly typical. Indeed, they are of the most common class, showing words fairly close together: More than 1/9 of the true name-date pairs are in the top 1/25 of proximities. But not only is the average proximity of names to matched dates in Genesis far smaller than expected (or than is found in controls), the distribution of the different individual proximities that make up the average is also remarkable. When charted, they do not form a bell-shaped curve, for instance, with a peak that merely yields a lower than expected mean. Nor is it a random-appearing distribution that happens to have a low average. Rather, its highest point is at zero distance, indicating that the largest number of paired words appear right beside each other, and the distribution (the number of pairs at a given proximity) drops off smoothly and rapidly as the proximity (the distance between the words) increases to the maximum possible (about 78,000). By contrast, the distribution for the controls is a flat line (a "uniform distribution") – as expected. All possible proximities between pairs are equally represented, and the average (composed of all these) is therefore about half the length of the text. This reflects the fact that in the controls the location of any word is independent of any other word – every such location being a matter of pure happenstance, as most people would presume should be true for Genesis as well, but isn't.

What might the distinctive shape of the distribution indicate? First, to return to one of [the third respondent]'s and [the fourth respondent]'s concerns, one possibility is that perhaps the text we have is in some small measure not precisely the original, though it must be close to it. Because of the aggregate nature of the phenomenon, introducing more and more small errors into the text will slowly degrade the robustness of the findings, but won't entirely efface them – until a certain critical degree of error is exceeded. (Studies by outside experts have already begun to quantitate this.) Perhaps (some of) the greater-than-zero proximities are due to such errors having slowly crept into the text. However, this cannot account entirely for the peculiar spread of proximities, since if there were no other additional cause(s), the distribution would consist of one sharp spike at zero proximity, and a random, approximately equal scattering of other proximities, small and large, caused by the errors. Second, since more than one death date for the same person can't be correct, we know that at least some of the spread must be due to inaccuracies in the historical data. But here, too, such errors alone (or in combination with textual errors) would leave a sharp spike at zero against a random scattering of other proximities. So, there must be an additional reason for the spread. (My own guess is that the phenomenon is intrinsically probabilistic – as is the ultimate reality it points to. Though counter to our intuitive understanding of, say, predestination, similarly strange ideas have unexpectedly been found at the foundation of the physical world as well.)

[The first letter-writer] has well expressed the balance of both skepticism and seriousness that the unusually high quality of this research demands. Rigorous investigations of the present research are being published by independent outside experts; one such piece (by Harold Gans, a senior mathematician with the U.S. Department of Defense) not only confirms the original findings using different techniques, it also shows that the cities of birth and death of the rabbis in the Witztum et al. study are also encoded in Genesis. The odds that the results of this new study occurred by chance are less than 1 in 200,000. Witztum et al. have themselves submitted a new paper, on a completely unrelated data set, also tested in Genesis, with odds of 1 in 250,000,000.

There are many (myself among them) who would like nothing more than for the results not only to continue to hold up, but to be extended further. Nonetheless, if the work is in error, it would be best for this to be demonstrated not just quickly but well. Casual dismissal can no more accomplish the latter than uncritical acceptance will accomplish the former. In a recent interview, David Kazhdan, professor and chairman of mathematics at Harvard, cautioned hasty skeptics of the Torah codes, "The phenomenon is real. What conclusion you reach from this is up to the individual."

—Dr. Jeffrey B. Satinover, February 1996
—Reprinted with permission

Divine Authorship? is ©1995 by Dr. Jeffrey Satinover, and was first printed in Bible Review Vol. IX No. 5.
This response to questions was printed in Bible Review Vol. X No. 1.
