Many languages (Erlang, Haskell, Scala, etc.) have a compiler optimization that makes recursion fast and memory constant if the function is tail recursive modulo-cons. They do this by altering/replacing the contents of the current stack and jumping to the function called instead of growing the stack. This makes this kind of recursion act as if it were some kind of loop where the loop body can be swapped in with new code and the stop condition can be controlled by a state machine rather than a simple condition.
That said, implementing fibonacci recursively as fib(n) = fib(n-1) + fib(n-2) isn't tail recursive modulo-cons. This means that Erlang, Haskell and Scala invoke the first recursive call and grow the stack. The tail-recursive version of fibonacci is essentially identical to the tranditional imperative looping version if observed in a trace.
Tail recursion optimization seems like no big deal on first glance, but is actually very useful for code structuring. It lets you structure programs directly as state machines by using tail calls for the state transitions. This is very convenient and helps code clarity. I think this is the key advantage.