The New York Times has made a cool tool that lets you play with the Laurel/Yanny phenomenon. Try it.
Yohan John has written an explanation of the effect.
And finally, here is Rachel Gutman in The Atlantic:
When you speak, you’re producing sound waves that are shaped by the length and shape of your vocal tract, which includes your vocal folds (vocal cords is a misnomer), throat, mouth, and nose. Linguists can study these sound waves and separate them out into their component frequencies, and display them in something called a spectrogram. Here’s the spectrogram for the yanny/laurel recording:
Higher frequencies (up to 5,000 hertz, or waves per second) appear toward the top, and lower ones (down to zero) toward the bottom. The dark bands are called formants; they’re the resonant frequencies of the vocal tract, and they depend on the length and shape of your vocal tract—i.e., all the space between your vocal folds, where the sound waves begin, and your mouth and nose, where they’re released.
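As an aside to the quoted passage, here is a rough sketch of how a spectrogram like the one described can be computed. It is not the method used for the article's figure. The function name `spectrogram`, the frame and hop sizes, and the synthetic two-tone signal are all assumptions chosen for illustration. The 10,000 Hz sample rate is picked only so that the frequency axis tops out at 5,000 Hz, the range mentioned above.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=400, hop=200):
    """Hann-windowed short-time FFT (illustrative helper, not a library API).

    Returns (freqs, times, magnitude); magnitude has one row per frequency
    bin and one column per time frame.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, magnitude

# Synthetic stand-in for speech: two steady tones (assumed values).
sample_rate = 10_000                      # highest representable frequency: 5,000 Hz
t = np.arange(sample_rate) / sample_rate  # one second of samples
tone = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)

freqs, times, mag = spectrogram(tone, sample_rate)
# The strongest row of the spectrogram sits at the louder tone's frequency.
peak = freqs[np.argmax(mag.mean(axis=1))]
```

Plotting `mag` with frequency on the vertical axis and time on the horizontal axis gives the kind of picture the passage describes. In real speech the energy concentrates in bands (the formants) rather than in two thin lines.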
The length of your vocal tract depends mostly on physiology: Women’s vocal folds tend to be higher up, so their tracts are shorter. The shape is largely based on where you put your tongue, like when you place the tip of your tongue between your teeth to make a th sound. By moving your tongue around in your mouth and opening and closing your lips, you change the sounds you’re making, and the formants you see in the spectrogram.
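To see why a shorter vocal tract shifts the formants upward, one common textbook idealization (not something the quoted article uses) treats the tract as a uniform tube closed at the vocal folds and open at the lips, whose resonances are F_n = (2n - 1) c / (4L). The sketch below applies that formula. The function name `tube_resonances`, the speed of sound, and the two tract lengths are illustrative assumptions, not measurements.

```python
def tube_resonances(length_m, n=3, speed_of_sound=350.0):
    """First n resonances (Hz) of a uniform tube closed at one end.

    Idealized model only: real vocal tracts are not uniform tubes, and
    tongue and lip position move the formants away from these values.
    """
    return [(2 * k - 1) * speed_of_sound / (4 * length_m)
            for k in range(1, n + 1)]

# Assumed, illustrative tract lengths in metres.
longer_tract = tube_resonances(0.175)
shorter_tract = tube_resonances(0.145)

for name, formants in [("longer", longer_tract), ("shorter", shorter_tract)]:
    print(name, [round(f) for f in formants])
```

Every resonance of the shorter tube is higher than the corresponding resonance of the longer one, which matches the passage's point that tract length sets the overall placement of the formants while tongue and lip movements reshape them.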
Chelsea Sanker, a phonetician at Brown University, looked at the spectrogram above to help me figure out what was going on.
More here.