Data tends to be high-volume, is not particularly sensitive to latency, and, when carried over TCP, has a retransmission mechanism to cope with lost packets.
Voice, on the other hand, is low-volume, made up of lots of very small packets, *is* sensitive to latency, and has no correction mechanism for lost packets.
It's therefore considered sensible to use VLANs to keep these two quite different traffic profiles separate from each other.
On a very small network it might be OK to put everything on a single VLAN, but as a rule of thumb it's a good idea to treat a /24 (254 usable host addresses) as the ideal subnet size. As you will be breaking up your Data subnets from each other anyway, you may as well do the same with your Voice subnets.
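
To make the /24 rule of thumb concrete, here is a rough sketch using Python's standard `ipaddress` module that carves a site block into /24 subnets and hands them out to Data VLANs first, then Voice VLANs. The `10.20.0.0/16` block, the VLAN IDs (10, 20, 30, ...) and the `plan_vlans` helper are all made up for illustration; substitute whatever addressing and numbering scheme you actually use.

```python
import ipaddress


def plan_vlans(site_block, data_count, voice_count, first_vlan_id=10):
    """Carve /24 subnets out of site_block: data VLANs first, then voice.

    Hypothetical helper for illustration only; VLAN IDs step by 10.
    """
    # subnets() yields consecutive /24s from the parent block, in order.
    subnets = ipaddress.ip_network(site_block).subnets(new_prefix=24)
    plan = []
    vlan_id = first_vlan_id
    for role, count in (("data", data_count), ("voice", voice_count)):
        for _ in range(count):
            net = next(subnets, None)
            if net is None:
                raise ValueError("site block is too small for the requested VLANs")
            plan.append({
                "vlan": vlan_id,
                "role": role,
                "subnet": str(net),
                # Exclude the network and broadcast addresses.
                "usable_hosts": net.num_addresses - 2,
            })
            vlan_id += 10
    return plan


# Example: two Data VLANs and two Voice VLANs from an assumed 10.20.0.0/16.
plan = plan_vlans("10.20.0.0/16", data_count=2, voice_count=2)
for entry in plan:
    print(entry)
```

Each entry ends up with its own /24 of 254 usable addresses, so the Voice subnets are separated from the Data subnets and from each other in exactly the same way.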