Text to Speech by using ttsvoice - Python
In this tutorial, we will convert the text written by a human into human-like speech.
Listening to information can be more engaging than reading it, and for some audiences it is easier to follow. Python offers several libraries for converting written text into spoken audio, including pyttsx3, gTTS, and espeak.
Text to Speech (TTS) is the process of converting a text string into spoken audio, so that the words are read aloud.
ttsvoice Library in Python
ttsvoice is a Python package used for Text to Speech conversion. It contains multiple packages like gTTS and pyttsx3, which convert human text into voice.
Installing the ttsvoice library in Python
The ttsvoice library can be installed from the command prompt, or inside a Python virtual environment, by running this command:
py -m pip install ttsvoice
Text to Speech Conversion using pyttsx3 API
Features of pyttsx3 API
- It is a straightforward tool to use that converts text into speech.
- It does not require any internet connection; it works offline.
- It works with both Python 2 and Python 3.
- It can convert text into speech in both female and male voices.
Installing pyttsx3 API in the system using a command prompt or any Python terminal
The pyttsx3 API can be installed from the command prompt, or inside a Python virtual environment, by running this command:
py -m pip install pyttsx3
On Windows, this library depends on the win32 extensions, and the program may raise an error if they are missing. To avoid this, install pypiwin32 in the same Python environment:
py -m pip install pypiwin32
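After running the install commands, it can be useful to confirm that the packages are visible to the interpreter before running any examples. The following is a minimal sketch using only the standard library; the helper names is_installed and missing_modules are hypothetical, and the module names in the last line are assumed from the packages discussed in this tutorial.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def missing_modules(module_names):
    """Return the names from module_names that cannot be found."""
    return [name for name in module_names if not is_installed(name)]


# Module names are assumptions based on the packages used in this tutorial.
print(missing_modules(["pyttsx3", "gtts", "playsound"]))
```

An empty list means every listed module was found; any name that is printed still needs to be installed with pip.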
Functions in pyttsx3 Library
- pyttsx3.init(): This function returns a reference to an engine instance that uses the specified driver. If another engine instance is already using that driver, that instance is returned; otherwise, a new engine is created.
- getProperty(): This function returns the current value of an engine property.
- setProperty(): It sets an engine property by queuing a command. The new value affects all utterances queued after this command.
- say(): This function queues the text given by the user to be spoken.
- runAndWait(): This function blocks while it processes all the queued commands. It invokes callbacks for engine notifications and returns once every command queued before the call has been processed.
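The queue behaviour described in the list above can be sketched as follows. This is an illustrative helper, not part of pyttsx3: the name speak_all is hypothetical, and it only assumes an engine object that offers the setProperty(), say(), and runAndWait() methods listed above.

```python
def speak_all(engine, phrases, rate=None):
    """Queue every phrase, then process the whole queue in one blocking call."""
    phrases = list(phrases)
    if rate is not None:
        # Queued first, so it applies to the phrases queued after it.
        engine.setProperty("rate", rate)
    for phrase in phrases:
        engine.say(phrase)  # only queues the phrase
    engine.runAndWait()  # blocks until the queue has been processed
    return len(phrases)
```

With pyttsx3 installed, it could be called as speak_all(pyttsx3.init(), ["First sentence.", "Second sentence."]).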
Three TTS engines are supported by pyttsx3:
- sapi5: used on Windows.
- NSSpeechSynthesizer (driver name nsss): used on macOS.
- espeak: used on other platforms, such as Linux.
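pyttsx3.init() normally picks a suitable driver by itself, but a driver name can also be passed explicitly. The sketch below shows one way to choose a name from the list above; the helper names default_driver and make_engine and the platform mapping are assumptions made for illustration.

```python
import sys

# Driver names accepted by pyttsx3.init(); the platform mapping is an assumption.
DRIVERS = {"win32": "sapi5", "darwin": "nsss"}


def default_driver(platform_name=None):
    """Return a driver name for the given (or current) platform."""
    if platform_name is None:
        platform_name = sys.platform
    return DRIVERS.get(platform_name, "espeak")


def make_engine(driver_name=None):
    """Create an engine with an explicit driver name."""
    import pyttsx3  # imported here so the helper above works without pyttsx3

    return pyttsx3.init(driver_name or default_driver())
```

Calling make_engine() with no argument uses the driver chosen for the current platform.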
Let’s understand the use of pyttsx3 API using a simple example.
Code
import pyttsx3

obj = pyttsx3.init()  # object of Text-to-speech engine
txt = "Hello, I am converting the Text into Speech."
print(txt)
obj.say(txt)  # converting text to speech
obj.runAndWait()  # play the speech
Output
As an output of this code, the written text will be heard in the human voice as:
“Hello, I am converting the Text into Speech.”
The pyttsx3 library is imported in this code, and an engine object obj is created with pyttsx3.init(). The text is then defined, obj.say() queues it to be spoken, and obj.runAndWait() blocks until every queued command has been processed, which is when the speech is actually played.
Implementing different functions and properties in pyttsx3
1. Speaking rate
The speaking rate is the speed of speech, measured in words per minute. We can check the current speaking rate using the getProperty() function:
rate = obj.getProperty("rate")
print(rate)
Output
200
We can change the rate by using the setProperty function. It can be used as:
obj.setProperty("rate", 300)
obj.say(txt)
obj.runAndWait()
Output
It will speak the words faster than before. To slow the speech down, lower the value, for example to 100.
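Instead of hard-coding a new value, the rate can be changed relative to its current value. The following is a minimal sketch; the helper name change_rate and the bounds of 50 and 400 are hypothetical choices, not limits defined by pyttsx3.

```python
def change_rate(engine, factor, lowest=50, highest=400):
    """Scale the current speaking rate by factor, clamped to [lowest, highest]."""
    current = engine.getProperty("rate")
    new_rate = int(round(current * factor))
    new_rate = max(lowest, min(highest, new_rate))  # keep the rate in range
    engine.setProperty("rate", new_rate)
    return new_rate
```

For example, change_rate(obj, 0.5) would halve the current rate before the next call to obj.say().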
2. Voice Details
The details of different voices available can be obtained by:
voices = obj.getProperty("voices")
print(voices)
Output
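[<pyttsx3.voice.Voice object at 0x...>, <pyttsx3.voice.Voice object at 0x...>]
The output is a list of voice objects, here one male voice and one female voice. The memory addresses in the output vary from run to run.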
3. Converting Voices
We can hear the text in either a male or a female voice by passing one of the voices listed above to setProperty().
On the system used here, the male and female voices are at indices 0 and 1 of the voices list, respectively; the order can differ on other machines.
For generating the text in a male voice, we can use the following:
obj.setProperty('voice', voices[0].id)
obj.say(txt)
obj.runAndWait()
Here setProperty() selects the voice by the id of voices[0], which is the male voice on this system.
For generating the text in the female voice, we can use the following:
obj.setProperty('voice', voices[1].id)
obj.say(txt)
obj.runAndWait()
Here setProperty() selects the voice by the id of voices[1], which is the female voice on this system.
We can switch voices by passing a voice id to the setProperty() function.
To get the voice ids, we will run a for loop, which will give the details of the voices present in our system:
for voice in voices:
    # to get the info. about various voices in our PC
    print("Voice:")
    print("ID: %s" % voice.id)
    print("Name: %s" % voice.name)
    print("Age: %s" % voice.age)
    print("Gender: %s" % voice.gender)
    print("Languages Known: %s" % voice.languages)
Output:
Voice:
ID: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_DAVID_11.0
Name: Microsoft David Desktop - English (United States)
Age: None
Gender: None
Languages Known: []
Voice:
ID: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_ZIRA_11.0
Name: Microsoft Zira Desktop - English (United States)
Age: None
Gender: None
Languages Known: []
We have got two voices with different ids.
For generating the text in a male voice, we can use the following:
voice_id = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_DAVID_11.0"
obj.setProperty('voice', voice_id)
obj.say(txt)
obj.runAndWait()
For generating the text in the female voice, we can use the following:
voice_id = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_ZIRA_11.0"
obj.setProperty('voice', voice_id)
obj.say(txt)
obj.runAndWait()
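Because voice ids are long and differ between machines, it can be more convenient to look a voice up by part of its name. The sketch below is one possible approach; find_voice and use_voice are hypothetical helpers that rely only on the name and id attributes printed in the loop above.

```python
def find_voice(voices, keyword):
    """Return the first voice whose name contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    for voice in voices:
        if keyword in (voice.name or "").lower():
            return voice
    return None


def use_voice(engine, keyword):
    """Switch the engine to the first matching voice; return True on success."""
    voice = find_voice(engine.getProperty("voices"), keyword)
    if voice is None:
        return False
    engine.setProperty("voice", voice.id)
    return True
```

On the system shown above, use_voice(obj, "zira") would select the Microsoft Zira voice, while a keyword that matches no installed voice leaves the engine unchanged.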
Text to Speech conversion using gTTS API
This package is used to convert human text into voice in different languages. The languages included are Hindi, English, Tamil, French, German, etc.
Installing gTTS API in the system using a command prompt or any Python terminal
The gTTS API can be installed from the command prompt, or inside a Python virtual environment, by running this command:
py -m pip install gTTS
Along with this, we need to install some packages like playsound and pyttsx3:
py -m pip install playsound
py -m pip install pyttsx3
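This package can be used on any platform, but unlike pyttsx3 it needs an internet connection, because the speech is generated by Google's online text-to-speech service.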
Let’s understand the use of gTTS API using a simple example.
Code
from gtts import gTTS
from playsound import playsound

txt = "Hello, I am converting the Text into Speech"
language = 'en'
obj = gTTS(text=txt, lang=language, slow=False)
obj.save("audio.mp3")
playsound("audio.mp3")
Output
The audio file is saved as "audio.mp3" and then played.
Here the gtts and playsound libraries are imported, and a gTTS object obj is created with the language set to en. The obj.save() function writes an MP3 file, which is then played using the playsound() function.
We can change the language by passing a different language code, and slow the speech down by changing slow=False to slow=True.
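To illustrate changing the language and the slow flag, the following sketch saves one MP3 file per language code. The helper names plan_outputs and save_speech and the file-naming scheme are hypothetical; the gTTS call is the same one used in the example above and needs an internet connection when it runs.

```python
def plan_outputs(languages, stem="audio"):
    """Pair each language code with an output file name."""
    return [(lang, f"{stem}_{lang}.mp3") for lang in languages]


def save_speech(text, languages, slow=False):
    """Save text as one MP3 per language code and return the file names."""
    from gtts import gTTS  # imported here; requires gTTS and internet access

    paths = []
    for lang, path in plan_outputs(languages):
        gTTS(text=text, lang=lang, slow=slow).save(path)
        paths.append(path)
    return paths
```

For example, save_speech("Hello", ["en", "fr"], slow=True) would write audio_en.mp3 and audio_fr.mp3 with slower speech. Note that gTTS does not translate the text; it reads the same string using the pronunciation of each language.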