How to build a Text To Speech Voice Generator

Posted By: Evan Miller    Posted Date: July 06, 2011    Category: General    URL: http://www.dotnetspark.com

This article shows how you can build a simple text-to-speech voice generator the easy way.


Voice over IP (VoIP)

VoIP (Voice over Internet Protocol) refers to voice transmission over the Internet, in other words, telephony over the Internet. Its main advantage over traditional telecommunication is that it runs on the existing network infrastructure. Because of this, a VoIP communication system is far less expensive than a traditional telecommunication system with similar functionality.

VoIP systems most commonly use the Session Initiation Protocol (SIP) for call signaling, largely because it is comparatively easy to implement.


Text-to-speech functionality means that the system converts ordinary (not mathematical or technical) text into voice. Such systems are based on so-called speech synthesis: they build the voice from the given input text on the basis of sample data and the characteristics of the human voice.

Sample programs

This tutorial includes two sample programs in Visual Basic that demonstrate how text-to-speech can be used in VoIP. The first, "TextToSpeechSDK_Sample" (Figure 1), has a text field and converts the text entered there into voice. You can then play the voice and/or save it as a WAV file.

The second program, "SDK_VoicePlayerSample" (Figure 2), plays the previously saved voice during a VoIP call. After it has played the WAV file, it ends the call.


It is quite simple to develop an application that only converts an input text into voice, especially if you use a recent speech API. This sample program uses the tools offered by Microsoft, so the source of the program takes only a few lines. The tools can be found in the System.Speech namespace. The conversion needs only a SpeechSynthesizer object, which performs the text-to-speech conversion.
Figure 1 - Text To Speech SDK Sample

    ' Requires: Imports System.Speech.Synthesis and Imports System.Speech.AudioFormat
    Private Sub buttonSave_Click(ByVal sender As Object, ByVal e As EventArgs) Handles buttonSave.Click
        If Not String.IsNullOrEmpty(Me.textBox1.Text) Then
            Using file As New SaveFileDialog()
                file.Filter = "Wav audio file|*.wav"
                file.Title = "Save a Wav audio File"
                If file.ShowDialog() = DialogResult.OK Then
                    ' Write 8000 Hz, 16-bit, mono audio to the chosen file.
                    Me.synthetizer.SetOutputToWaveFile(file.FileName, New SpeechAudioFormatInfo(8000, AudioBitsPerSample.Sixteen, AudioChannel.Mono))
                    Me.synthetizer.Speak(Me.textBox1.Text)
                    ' Release the file and route the output back to the speakers.
                    Me.synthetizer.SetOutputToDefaultAudioDevice()
                    MessageBox.Show("File saving completed.")
                End If
            End Using
        End If
    End Sub

The code above performs both the conversion and the saving. Essentially, the whole program requires only a few lines of code. As the source shows, a file is opened for writing, and the output of the SpeechSynthesizer object (from System.Speech.dll) is set to this file.

So that the other application can process the saved sound, I used an 8000 Hz sampling frequency with 16 bits per sample and a mono audio channel. The conversion is performed by the Speak method of the SpeechSynthesizer object, which takes the input text (Me.synthetizer.Speak(Me.textBox1.Text)) and writes the resulting audio to the output that was set on the SpeechSynthesizer.
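The same conversion can also target memory instead of a file. The following is a minimal console sketch, not part of the sample programs: it assumes a Windows machine with an installed voice and a project reference to System.Speech, and the module and function names (TextToPcmSketch, SynthesizeToPcm, DurationSeconds) are made up for illustration. It synthesizes raw PCM in the same 8000 Hz, 16-bit, mono format and estimates the playback length from the byte count.

```vb
Imports System
Imports System.IO
Imports System.Speech.AudioFormat
Imports System.Speech.Synthesis

Module TextToPcmSketch
    ' Same audio format as the sample: 8000 Hz, 16 bits per sample, mono.
    Public Const SampleRate As Integer = 8000
    Public Const BytesPerSample As Integer = 2

    ' Synthesizes the text into raw PCM bytes (no WAV header) held in memory.
    Public Function SynthesizeToPcm(ByVal text As String) As Byte()
        Dim format As New SpeechAudioFormatInfo(SampleRate, AudioBitsPerSample.Sixteen, AudioChannel.Mono)
        Using buffer As New MemoryStream()
            Using synth As New SpeechSynthesizer()
                synth.SetOutputToAudioStream(buffer, format)
                synth.Speak(text)
                synth.SetOutputToNull() ' detach the stream before it is disposed
            End Using
            Return buffer.ToArray()
        End Using
    End Function

    ' Playback length in seconds of a raw PCM buffer in the format above.
    Public Function DurationSeconds(ByVal byteCount As Integer) As Double
        Return byteCount / (SampleRate * BytesPerSample)
    End Function

    Sub Main()
        Dim pcm As Byte() = SynthesizeToPcm("Hello from the text to speech sample.")
        Console.WriteLine("{0} bytes, about {1:F2} seconds", pcm.Length, DurationSeconds(pcm.Length))
    End Sub
End Module
```

At this format one second of speech takes 8000 x 2 = 16000 bytes, so the byte count alone is enough to estimate how long a call will last when the audio is played back.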


This program is essentially a simplified VoIP SIP softphone (Figure 2). Since it is only a demonstration, it offers just dialing and call initiation. If the call is established, the application plays the sound saved by the other program during the call and then ends the call. To keep things simple, the softphone is built on Ozeki VoIP SIP SDK, which can also be used to create a more complex VoIP phone with many more functions in just a few lines of source code.

Figure 2 - SDK Voice Player Sample

Running the program

When the application is started, it automatically tries to register to the SIP PBX with the given parameters. If the registration succeeds, an "online" caption appears on the display to show that the program is ready to make calls. To make a call, dial the phone number and click the call button, then open the previously recorded WAV file. If the WAV file loads successfully, the program starts the call to the dialed party. Once the call is established, the program plays the sound data to the other party with the help of a timer. After playing the whole sound data, the program ends the call and you can start a new one.

Source code

To stay simple, the program does not use any design patterns or other conventions. The full source code can therefore be found in the FormSoftphone.cs file, which belongs to the user interface.

With Ozeki VoIP SIP SDK, only a few objects and the handling of their events are needed to create the whole functionality of a softphone. The following objects are used in this program:

    Private phoneCall As IPhoneCall
    Private mediaTimer As MediaTimer
    Private phoneLine As IPhoneLine
    Private phoneLineInformation As PhoneLineInformation
    Private softPhone As ISoftPhone
    Private wavReader As ozWaveFileReader

ISoftPhone: It represents the softphone itself, whose telephone lines are represented by IPhoneLine objects. A softphone can have several telephone lines, which means that you can develop a multi-line phone. For simplicity, this example uses only one telephone line.

IPhoneLine: It represents a telephone line that can be registered to a SIP PBX, for example Asterisk, 3CX, or one of the free PBXs offered by SIP providers. Registration is made via a SIP account.

PhoneLineInformation: It is an enum type that represents the status of the phone line with the PBX, for example registered, not registered, or successful/unsuccessful registration.

IPhoneCall: It represents a call: the status of the call, the direction of the call, the telephone line on which it was created, the called party, etc.

MediaTimer: It is a timer that ensures more accurate timing than the standard .NET timers.

ozWaveFileReader: It extends the .NET Stream type to simplify reading and processing WAV audio files.

After the program is started, it automatically registers to a previously specified SIP PBX server. This is done by the InitializeSoftPhone() method, which is called in the 'Load' event handler of the form.

    Private Sub InitializeSoftPhone()
        Try
            ' Arguments: local IP address (fill in your own), minimum port, maximum port, SIP port.
            Me.softPhone = SoftPhoneFactory.CreateSoftPhone("", 5700, 5760, 5700, Nothing)
            ' Sample account values from the article; replace them with your own SIP account.
            Me.phoneLine = Me.softPhone.CreatePhoneLine(New SIPAccount(True, "oz891", "oz891", "oz891", "oz891", "", 5060))
            ' Get notified about registration status changes of the phone line.
            AddHandler Me.phoneLine.PhoneLineInformation, New EventHandler(Of VoIPEventArgs(Of PhoneLineInformation))(AddressOf Me.phoneLine_PhoneLineInformation)
            Me.softPhone.RegisterPhoneLine(Me.phoneLine, Nothing)
            ' The timer paces audio playback: one tick every 20 ms.
            Me.mediaTimer = New MediaTimer
            Me.mediaTimer.Period = 20
            AddHandler Me.mediaTimer.Tick, New EventHandler(AddressOf Me.mediaTimer_Tick)
        Catch ex As Exception
            MessageBox.Show(String.Format("You didn't give your local IP address, so the program won't run properly." & ChrW(10) & " {0}", ex.Message), String.Empty, MessageBoxButtons.OK, MessageBoxIcon.Hand)
        End Try
    End Sub

Here the softphone object is instantiated with the network parameters of the computer it runs on. Please note that if you do not update these parameters (in particular the IP address) to match your own computer, the program will not be able to register to the given SIP PBX. The parameters are as follows: the IP address of the local computer, the minimum port to be used, the maximum port to be used, and the port assigned to receive SIP messages.

Next, create a phone line with a SIP account, which can be a user account of your corporate SIP PBX or a free SIP provider account. To display the status of the created phone line, subscribe to its 'phoneLine.PhoneLineInformation' event.

Then you only need to register the created 'phoneLine' with the 'softPhone'. In this example only one telephone line is registered, but multiple telephone lines can also be registered and handled with Ozeki VoIP SIP SDK. Once the phone line registration succeeds, the application is ready to load sound data and to call a specified phone number.

Making an outgoing call

Outgoing calls can be made by entering the phone number to be called and clicking the 'Call' button.

    Private Sub buttonPickUp_Click(ByVal sender As Object, ByVal e As EventArgs) Handles button13.Click
        If String.IsNullOrWhiteSpace(Me.labelDialingNumber.Text) Then
            MessageBox.Show("You haven't given a phone number.")
            Return
        End If
        ' Only start a new call when no call is in progress.
        If Me.phoneCall Is Nothing Then
            If (Me.phoneLineInformation <> PhoneLineInformation.RegistrationSucceded) AndAlso (Me.phoneLineInformation <> PhoneLineInformation.NoRegNeeded) Then
                MessageBox.Show("Phone line state is not valid!")
            Else
                Using openFileDialog As New OpenFileDialog()
                    openFileDialog.Multiselect = False
                    openFileDialog.Filter = "Wav audio file|*.wav"
                    openFileDialog.Title = "Open a Wav audio File"
                    If openFileDialog.ShowDialog() = DialogResult.OK Then
                        ' Load the audio first, then create and start the call.
                        Me.wavReader = New ozWaveFileReader(openFileDialog.FileName)
                        Me.phoneCall = Me.softPhone.CreateCallObject(Me.phoneLine, Me.labelDialingNumber.Text, Nothing)
                        Me.WireUpCallEvents()
                        Me.phoneCall.Start()
                    End If
                End Using
            End If
        End If
    End Sub

When you click the 'Call' button, the file open dialog appears. In this window you can select the WAV audio file that you want to play during the phone call. The call itself is made via the IPhoneCall object of Ozeki VoIP SIP SDK, so you need to create such a call object on the registered phone line of the softphone.
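Since the player expects the 8000 Hz, 16-bit, mono format that the first program saves, it can be useful to check a selected file before starting the call. The sketch below is not part of the sample and does not use the SDK: it assumes a canonical 44-byte PCM WAV header (files with extra chunks before the audio data would need real chunk parsing), and the names WavFormatCheckSketch, IsTelephonyWav and BuildHeader are hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Text

Module WavFormatCheckSketch
    ' Returns True when the stream starts with a canonical RIFF/WAVE header
    ' describing 8000 Hz, 16-bit, mono PCM audio.
    ' BitConverter follows the machine's byte order, which is little-endian
    ' on Windows PCs and therefore matches the WAV format.
    Public Function IsTelephonyWav(ByVal source As Stream) As Boolean
        Dim header As Byte() = New BinaryReader(source).ReadBytes(44)
        If header.Length < 44 Then Return False

        Dim riff As String = Encoding.ASCII.GetString(header, 0, 4)
        Dim wave As String = Encoding.ASCII.GetString(header, 8, 4)
        Dim fmt As String = Encoding.ASCII.GetString(header, 12, 4)
        If riff <> "RIFF" OrElse wave <> "WAVE" OrElse fmt <> "fmt " Then Return False

        Dim audioFormat As Integer = BitConverter.ToInt16(header, 20) ' 1 = PCM
        Dim channels As Integer = BitConverter.ToInt16(header, 22)
        Dim sampleRate As Integer = BitConverter.ToInt32(header, 24)
        Dim bitsPerSample As Integer = BitConverter.ToInt16(header, 34)

        Return audioFormat = 1 AndAlso channels = 1 AndAlso sampleRate = 8000 AndAlso bitsPerSample = 16
    End Function

    ' Builds a 44-byte PCM header with an empty data chunk, for demonstration only.
    Public Function BuildHeader(ByVal sampleRate As Integer, ByVal bitsPerSample As Integer,
                                ByVal channels As Integer) As Byte()
        Dim blockAlign As Integer = channels * bitsPerSample \ 8
        Using buffer As New MemoryStream()
            Dim writer As New BinaryWriter(buffer)
            writer.Write(Encoding.ASCII.GetBytes("RIFF"))
            writer.Write(36)                       ' RIFF chunk size for an empty data chunk
            writer.Write(Encoding.ASCII.GetBytes("WAVE"))
            writer.Write(Encoding.ASCII.GetBytes("fmt "))
            writer.Write(16)                       ' size of the PCM fmt chunk
            writer.Write(CShort(1))                ' audio format: 1 = PCM
            writer.Write(CShort(channels))
            writer.Write(sampleRate)
            writer.Write(sampleRate * blockAlign)  ' byte rate
            writer.Write(CShort(blockAlign))
            writer.Write(CShort(bitsPerSample))
            writer.Write(Encoding.ASCII.GetBytes("data"))
            writer.Write(0)                        ' no audio data follows
            writer.Flush()
            Return buffer.ToArray()
        End Using
    End Function

    Sub Main()
        Console.WriteLine(IsTelephonyWav(New MemoryStream(BuildHeader(8000, 16, 1))))
        Console.WriteLine(IsTelephonyWav(New MemoryStream(BuildHeader(44100, 16, 2))))
    End Sub
End Module
```

In the softphone, a check like this could run on the selected file name before the call object is created, so that an unsuitable file is rejected before the other party is dialed.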

To make a successful call you need to subscribe to some events (Me.WireUpCallEvents()). Through these events the application receives information about the changes that occur during the call. After subscribing to these events, you only need to invoke the Start() method on the IPhoneCall object that represents the call. As a result, the call starts to be established.

    Private Sub WireUpCallEvents()
        AddHandler Me.phoneCall.CallStateChanged, New EventHandler(Of VoIPEventArgs(Of CallState))(AddressOf Me.call_CallStateChanged)
        AddHandler Me.phoneCall.CallErrorOccured, New EventHandler(Of VoIPEventArgs(Of CallError))(AddressOf Me.call_CallErrorOccured)
    End Sub

The CallStateChanged event is used to display the changes of the call status. The call states include Setup, Ring, InCall, Completed and Rejected.

    Private Sub call_CallStateChanged(ByVal sender As Object, ByVal e As VoIPEventArgs(Of CallState))
        ' Show the new state on the form (UI updates must run on the GUI thread).
        Me.InvokeGUIThread(Sub()
                               Me.labelCallStatus.Text = e.Item.ToString
                           End Sub)
        Select Case e.Item
            Case CallState.InCall
                ' The call was answered: start sending audio.
                Me.mediaTimer.Start()
            Case CallState.Completed
                ' The call ended: stop sending and reset for the next call.
                Me.mediaTimer.Stop()
                Me.phoneCall = Nothing
                Me.InvokeGUIThread(Sub()
                                       Me.labelDialingNumber.Text = String.Empty
                                   End Sub)
            Case CallState.Cancelled
                Me.phoneCall = Nothing
        End Select
    End Sub

In this case only the 'InCall', 'Cancelled' and 'Completed' call states are important. After the call is initiated, it enters the 'Setup' state. When the called party accepts the call, we are notified via the 'CallStateChanged' event, which carries the new state in its arguments. If the new state is 'InCall' (that is, the telephone was picked up), the application starts sending the WAV audio to the other party.

Since the participants of a VoIP communication send a defined amount of sound data within a defined time period, I do not send out the whole sound data at once. With the help of a MediaTimer, 320 bytes of sound data are sent every 20 ms via the 'SendMediaData' method of the call object:

    Private Sub mediaTimer_Tick(ByVal sender As Object, ByVal e As EventArgs)
        If Me.wavReader IsNot Nothing Then
            ' One 20 ms frame of 8000 Hz, 16-bit, mono audio.
            Dim data As Byte() = New Byte(320 - 1) {}
            If Me.wavReader.Read(data, 0, 320) = 0 Then
                ' Nothing left to read: the whole file has been played.
                Me.phoneCall.HangUp()
            Else
                Me.phoneCall.SendMediaData(VoIPMediaType.Audio, data)
            End If
        End If
    End Sub

Once the application has played the whole file, it hangs up the call by calling Me.phoneCall.HangUp().
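The 320-byte frame size follows from the audio format chosen when the file was saved: 8000 samples per second x 2 bytes per sample x 1 channel x 0.020 seconds = 320 bytes. The following self-contained sketch reproduces that arithmetic and the read-until-empty loop of the timer handler. It does not use the SDK: a MemoryStream stands in for the WAV reader, and the names FramePacingSketch, FrameSizeBytes and CountFrames are made up for illustration.

```vb
Imports System
Imports System.IO

Module FramePacingSketch
    ' Bytes of PCM audio that cover one timer period.
    Public Function FrameSizeBytes(ByVal sampleRate As Integer, ByVal bitsPerSample As Integer,
                                   ByVal channels As Integer, ByVal periodMs As Integer) As Integer
        Return sampleRate * (bitsPerSample \ 8) * channels * periodMs \ 1000
    End Function

    ' Counts how many timer ticks are needed to drain the stream, reading one
    ' frame per tick in the same way as mediaTimer_Tick does.
    Public Function CountFrames(ByVal audio As Stream, ByVal frameSize As Integer) As Integer
        Dim frames As Integer = 0
        Dim data(frameSize - 1) As Byte
        While audio.Read(data, 0, frameSize) > 0
            frames += 1
        End While
        Return frames
    End Function

    Sub Main()
        Dim frameSize As Integer = FrameSizeBytes(8000, 16, 1, 20)
        Console.WriteLine("Frame size: {0} bytes", frameSize)

        ' One second of silence in the sample's format: 8000 samples * 2 bytes.
        Using audio As New MemoryStream(New Byte(16000 - 1) {})
            Dim frames As Integer = CountFrames(audio, frameSize)
            Console.WriteLine("{0} frames, {1} ms of playback", frames, frames * 20)
        End Using
    End Sub
End Module
```

If the saved file used a different sampling rate or sample size, the frame size would have to change accordingly, otherwise the audio would play too fast or too slow on the other end.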

The call.CallErrorOccured event reports the reasons that prevent a call from being established. Such reasons are, for example, that the called party is busy, the call was rejected, or the called number does not exist or is unavailable.
