This essay (serialized here across 24 separate posts) uses words and numbers to discuss the uses of words and numbers — particularly examining evaluations of university degrees that employ statistical data to substantiate competing claims. Statistical analyses are crudely introduced as the mode du jour of popular logic, but any ratiocinative technique could likely be inserted in this refillable space and applied to create and defend categories of meaning with or without quantitative support. Questions posed across the series include: Is the data informing or affirming what we believe? What are the implications of granting this approach broader authority? The author, Melanie Williams, graduated from UA in 2006 with a B.A. in Anthropology and Religious Studies.
“We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at any given moment knew all of the forces that animate nature and the mutual positions of the beings that compose it, if this intellect were vast enough to submit the data to analysis, could condense into a single formula the movement of the greatest bodies of the universe and that of the lightest atom; for such an intellect nothing could be uncertain and the future just like the past would be present before its eyes.”
Complex systems have historically defied deterministic approaches, not least because the feedback loops of a dynamic system introduce chance, even in an environment in which other variables may be controlled. “Laplace’s demon,” then, is considered an outmoded approach to physics, but the philosophy of determinism can still be seen in the data-gathering approach of the Digital Age. The goal needn’t be a harmonious unified theory. It is enough to take the idea of a lossless network of causation and move along to the next question, heuristically, Bayesian-like, keeping what works and discarding what doesn’t. When Chris Anderson states that the usefulness of conceptual models may be waning in proportion to our solid-state storage capacities, he has cause to believe in the possibility. Anderson himself explicitly rejects the need to determine causality – “Correlation is enough” – but the pattern-seeking processes we design algorithms to perform imply causation of a sort – else there is nothing but a coincidence of events, for which we must invent some other meaning. The sea change ushered in by new technologies that allow us to observe, measure, and test what was formerly theoretical has provided extraordinary ways to move forward in many lines of inquiry. The field of what is possible has been expanded by the Big Data approach, hence its promise and popularity, within limits. Many proponents are not so eager as Anderson to throw models out the window, and since an algorithm must be programmed to “seek” “patterns” and many data analyses rely on the Law of Large Numbers, any claim to abandon models altogether would seem problematic in itself.
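The trouble with “correlation is enough” can be made concrete with a short sketch. The variables, noise levels, and hidden cause below are my own hypothetical construction, not anything from the essay: two series that never influence each other still correlate strongly when both are driven by a third, unobserved factor – exactly the “coincidence of events” for which we must invent some other meaning.

```python
import random
import statistics

random.seed(0)  # reproducible illustration

# Hypothetical setup: z is a hidden common cause; x and y are each
# z plus independent noise. Neither x nor y causes the other, yet a
# pattern-seeking algorithm would flag them as strongly related.
n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]        # the unobserved driver
x = [zi + random.gauss(0, 0.5) for zi in z]       # apparent "effect" one
y = [zi + random.gauss(0, 0.5) for zi in z]       # apparent "effect" two

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

r = pearson(x, y)
print(f"correlation between x and y: {r:.2f}")  # strong (≈ 0.8 in expectation)
```

The correlation is real and reproducible, but the causal story it suggests is false; only a model of the system – here, knowing that z exists – distinguishes the two.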
I am more interested, though, in the notion that collecting enough data to subject to an algorithm can slowly flesh out a “truth” we wouldn’t otherwise investigate, much as “all the forces that animate nature and mutual positions of the beings that compose it” could be submitted to an analysis yielding a single formula in which past and future uncertainty is diminished to the point of oblivion. This premise – not unique to the Big Data movement – suggests that “randomness,” or the absence of a discernible pattern, is merely the absence of sufficient data to discern the pattern. The more data we can compile, the more precisely we can define a system’s limits and averages, and the more confidence we may have in calculating, probabilistically, the system’s behavior. How much data do we gather? As much as the growing bellies of our devices will hold. Like the Cookie Monster, who was never satiated, yet through the magic of palatable public television, also never vomited.
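That premise – more data, tighter averages, more probabilistic confidence – is the Law of Large Numbers in miniature. A minimal sketch, using a toy example of my own (a fair die, not anything from the essay): with a handful of rolls the sample average wanders, and as the sample grows it settles toward the true mean of 3.5.

```python
import random

random.seed(1)  # reproducible illustration

# Toy example: a fair six-sided die has true mean 3.5. The more rolls
# we compile, the closer the sample average tends to sit to that value.
def sample_mean(n: int) -> float:
    """Average of n simulated fair-die rolls."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(f"{n:>7} rolls -> sample mean {sample_mean(n):.3f}")
```

Note what the sketch does not deliver: convergence of the average tells us the system’s long-run behavior, not the outcome of any single roll – the uncertainty is tamed statistically, never abolished.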
Part 17 coming tomorrow morning…