When inter-networking was first being invented, the functions of today's TCP and IP were combined in a single protocol, also called TCP.
Research on sending packet voice over the ARPANET dates back at least to 1972, and one of the key reasons for separating IP from TCP was the realization that packet voice and similar applications did not need reliable delivery. Speech data must be played out at the same rate as it is sent, so there is no time to retransmit packets that arrive with errors.
The solution was to split IP out, making it responsible solely for best-effort delivery of packets to the host with a given IP address. Error checking and retransmission would be left to TCP.
For applications that did not require the reliability of TCP, a stripped-down transport protocol called the User Datagram Protocol (UDP) was invented. UDP is used in VoIP, teleconferencing, and other isochronous applications.
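The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not how any particular VoIP product works: the two-byte sequence-number header, the `SILENCE` placeholder, and the helper names (`pack_frame`, `unpack_frame`, `playout`, `loopback_demo`) are all assumptions made for the example. The receiver never asks for a lost frame again; it simply plays silence in that slot and moves on.

```python
import socket
import struct

# Hypothetical placeholder played in place of a frame that never arrived.
SILENCE = b"\x00" * 4


def pack_frame(seq, payload):
    """Prefix a payload with a 16-bit sequence number (illustrative format)."""
    return struct.pack("!H", seq) + payload


def unpack_frame(datagram):
    """Split a datagram back into (sequence number, payload)."""
    (seq,) = struct.unpack("!H", datagram[:2])
    return seq, datagram[2:]


def playout(datagrams, total):
    """Order received frames by sequence number, filling gaps with silence.

    Nothing is retransmitted: a missing frame is simply replaced.
    """
    received = dict(unpack_frame(d) for d in datagrams)
    return [received.get(seq, SILENCE) for seq in range(total)]


def loopback_demo(frames, drop=frozenset()):
    """Send frames over UDP on 127.0.0.1, skipping those in `drop`
    to simulate packet loss, then return the playout sequence."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))  # let the OS choose a free port
    rx.settimeout(0.2)         # stop waiting once no more datagrams arrive
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    got = []
    try:
        for seq, payload in enumerate(frames):
            if seq not in drop:
                tx.sendto(pack_frame(seq, payload), rx.getsockname())
        while True:
            try:
                data, _addr = rx.recvfrom(2048)
            except socket.timeout:
                break
            got.append(data)
    finally:
        tx.close()
        rx.close()
    return playout(got, len(frames))
```

Calling `loopback_demo([b"aaaa", b"bbbb", b"cccc"], drop={1})` sends only the first and third frames; the playout sequence then carries `SILENCE` in the middle slot. A TCP-based design would instead stall the stream until the missing data had been resent.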
VoIP remained a research topic until February 1995, when a small Israeli company, VocalTec Communications, released a program called InternetPhone. Hobbyists began making free, low-quality calls. InternetPhone soon had imitators, and all of these programs improved rapidly.
Today, the technology has improved to the point that VoIP is commercially viable, and it is expected to gradually replace circuit-switched voice.
Here are the original Requests for Comments (RFCs) on these key protocols: