In a 1985 paper, the computer scientist Andrew Yao, who would go on to win the A.M. Turing Award, asserted that among hash tables with a specific set of properties, the best way to find an individual element or an empty spot is to just go through potential spots randomly, an approach known as uniform probing. He also stated that, in the worst-case scenario, where you’re searching for the last remaining open spot, you can never do better than x. For 40 years, most computer scientists assumed that Yao’s conjecture was true.
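To make the "never better than x" figure concrete, here is a minimal simulation sketch of uniform probing. It assumes that x measures fullness in the sense that a table is a fraction 1 − 1/x full (so x = 10 means 90 percent full). The function names, table size, and trial count are illustrative choices and do not come from Yao's paper.

```python
import random

def probes_to_find_empty(occupied, rng):
    """Visit slots in a uniformly random order until an empty one turns up.

    Returns how many slots were examined, counting the empty one.
    """
    order = list(range(len(occupied)))
    rng.shuffle(order)  # a fresh random probe sequence for this search
    for count, slot in enumerate(order, start=1):
        if not occupied[slot]:
            return count
    raise RuntimeError("table is full")

def average_probes(size, x, trials, seed=0):
    """Average probes needed when the table is a fraction 1 - 1/x full (assumed meaning of x)."""
    rng = random.Random(seed)
    n_filled = size - size // x
    occupied = [True] * n_filled + [False] * (size - n_filled)
    rng.shuffle(occupied)  # scatter the filled slots at random
    total = sum(probes_to_find_empty(occupied, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for x in (2, 10, 50):
        print(x, round(average_probes(1000, x, 2000), 2))
```

Under these assumptions the printed averages should land close to x itself: the fuller the table, the longer a random search for an open spot takes.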
Krapivin was not held back by the conventional wisdom for the simple reason that he was unaware of it. “I did this without knowing about Yao’s conjecture,” he said. His explorations with tiny pointers led to a new kind of hash table, one that did not rely on uniform probing. And for this new hash table, the time required for worst-case queries and insertions is proportional to (log x)², far faster than x. This result directly contradicted Yao’s conjecture. Farach-Colton and Kuszmaul helped Krapivin show that (log x)² is the optimal, unbeatable bound for the popular class of hash tables Yao had written about.
“This result is beautiful in that it addresses and solves such a classic problem,” said Guy Blelloch of Carnegie Mellon.
“It’s not just that they disproved [Yao’s conjecture], they also found the best possible answer to his question,” said Sepehr Assadi of the University of Waterloo. “We could have gone another 40 years before we knew the right answer.”
Krapivin on the King’s College Bridge at the University of Cambridge. His new hash table can find and store data faster than researchers ever thought possible.
In addition to refuting Yao’s conjecture, the new paper also contains what many consider an even more astonishing result. It pertains to a related, though slightly different, situation: In 1985, Yao looked not only at the worst-case times for queries, but also at the average time taken across all possible queries. He proved that hash tables with certain properties, including those that are labeled “greedy,” which means that new elements must be placed in the first available spot, could never achieve an average time better than log x.
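The log x average can be seen in a small experiment with a greedy table. The sketch below fills an initially empty table by giving each new key a fresh random probe order and placing it in the first empty slot it meets. In such a table, and as long as nothing is deleted or moved, looking up a key retraces the probes used to insert it, so the average insertion cost doubles as the average query cost. As before, x is assumed to mean that the final table is a fraction 1 − 1/x full; the function names and sizes are illustrative and not taken from the paper.

```python
import math
import random

def greedy_insert(occupied, order, rng):
    """Place one key in the first empty slot of a fresh random probe order.

    `order` is a scratch permutation of slot indices, reshuffled lazily
    (partial Fisher-Yates) so only the slots actually probed are drawn.
    Returns the number of probes used.
    """
    m = len(order)
    for i in range(m):
        j = rng.randrange(i, m)
        order[i], order[j] = order[j], order[i]
        slot = order[i]
        if not occupied[slot]:
            occupied[slot] = True  # greedy: take the first empty slot seen
            return i + 1
    raise RuntimeError("table is full")

def average_greedy_probes(size, x, seed=0):
    """Average probes per key after filling to a fraction 1 - 1/x (assumed meaning of x)."""
    rng = random.Random(seed)
    occupied = [False] * size
    order = list(range(size))
    n_keys = size - size // x
    total = sum(greedy_insert(occupied, order, rng) for _ in range(n_keys))
    return total / n_keys

if __name__ == "__main__":
    for x in (10, 100, 1000):
        print(x, round(average_greedy_probes(20000, x), 2), round(math.log(x), 2))
```

Under these assumptions the measured average should track the natural logarithm of x up to a modest factor, even though the last few insertions each cost on the order of x probes.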
Farach-Colton, Krapivin, and Kuszmaul wanted to see if that same limit also applied to non-greedy hash tables. They showed that it did not by providing a counterexample, a non-greedy hash table with an average query time that’s much, much better than log x. In fact, it doesn’t depend on x at all. “You get a number,” Farach-Colton said, “something that is just a constant and doesn’t depend on how full the hash table is.” The fact that you can achieve a constant average query time, regardless of the hash table’s fullness, was wholly unexpected, even to the authors themselves.
The team’s results may not lead to any immediate applications, but that’s not all that matters, Conway said. “It’s important to understand these kinds of data structures better. You don’t know when a result like this will unlock something that lets you do better in practice.”
Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.