Speech Module in Python: Converting text to speech, known as speech synthesis, is the computer-generated recreation of human speech. A speech module converts written human-language text into human-like speech audio.

In this article, we will discuss how to convert text to speech in Python. We will not be developing any neural networks or training a model to achieve this. Instead, we will use APIs and engines that offer text-to-speech conversion from Python. Many APIs offer this, and one of the most widely used is Google Text to Speech, available through the online library gTTS. In contrast, pyttsx3, the other library we will discuss, works offline.

To get started with the online library, gTTS (Google Text To Speech), install it together with playsound using pip:

 pip3 install gTTS              # Google text to speech library
 pip3 install playsound

Online Text to Speech Module

gTTS is a Python library for interfacing with Google Translate's text-to-speech API. It only works with an internet connection, and it is simple to use.

Open a new Python file and import the libraries:

 import gtts             # Google text to speech library
 from playsound import playsound

To use this library, we pass the text to the gTTS class, which is the interface to Google Translate's text-to-speech API.

For Example:

 # first, create the gTTS object for the text we want synthesized
 textts = gtts.gTTS("Hello Programmers")

The text is now wrapped in a gTTS object. Saving it requests the speech from the API (Application Programming Interface) and writes the audio to a file:

 # save the audio into a file
 textts.save("JTP.mp3")

Now a new file has appeared in the current directory. We can play it with the playsound module, which we installed earlier.

 # play the audio file
 playsound("JTP.mp3")

And now we can hear a synthesized voice speaking the text we passed in. We can also use other languages by passing the lang parameter:

 # for example, in Spanish
 textts = gtts.gTTS("Hola Española", lang="es")
 textts.save("spanish.mp3")
 playsound("spanish.mp3")

If we do not want to save the audio to a file and would rather keep it in memory for direct playback, we can use textts.write_to_fp(), which writes the audio into a file-like object such as io.BytesIO().
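As a minimal sketch of the in-memory route, the helper below writes the audio into an io.BytesIO buffer and rewinds it so that it can be read from the start. The names tts_to_buffer and demo are hypothetical, not part of gTTS, and the only thing assumed about the object passed in is that it offers gTTS's write_to_fp() method:

```python
import io


def tts_to_buffer(tts):
    """Write synthesized audio into memory instead of a file on disk.

    `tts` is expected to be a gtts.gTTS object; any object that offers
    write_to_fp(file_like) is handled the same way.
    """
    buffer = io.BytesIO()
    tts.write_to_fp(buffer)   # the audio bytes are written into the buffer
    buffer.seek(0)            # rewind so readers start at the first byte
    return buffer


def demo():
    # Hypothetical usage; needs the gTTS package and an internet connection.
    import gtts

    audio = tts_to_buffer(gtts.gTTS("Hello Programmers"))
    print(len(audio.getvalue()), "bytes of audio held in memory")
```

Calling demo() runs the sketch against the real service. The resulting buffer suits players that accept a file-like object; playsound, as used earlier, is given a file name instead.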

The available languages can be listed as follows:

 # list all available languages along with their IETF tags
 print(gtts.lang.tts_langs())

Here are the supported languages (the exact list depends on the installed gTTS version):

Output:

{'af': 'Afrikaans', 'sq': 'Albanian', 'ar': 'Arabic', 'hy': 'Armenian', 'bn': 'Bengali', 'bs': 'Bosnian', 'ca': 'Catalan', 'hr': 'Croatian', 'cs': 'Czech', 'da': 'Danish', 'nl': 'Dutch', 'en': 'English', 'eo': 'Esperanto', 'et': 'Estonian', 'tl': 'Filipino', 'fi': 'Finnish', 'fr': 'French', 'de': 'German', 'el': 'Greek', 'gu': 'Gujarati', 'hi': 'Hindi', 'hu': 'Hungarian', 'is': 'Icelandic', 'id': 'Indonesian', 'it': 'Italian', 'ja': 'Japanese', 'jw': 'Javanese', 'kn': 'Kannada', 'km': 'Khmer', 'ko': 'Korean', 'la': 'Latin', 'lv': 'Latvian', 'mk': 'Macedonian', 'ml': 'Malayalam', 'mr': 'Marathi', 'my': 'Myanmar (Burmese)', 'ne': 'Nepali', 'no': 'Norwegian', 'pl': 'Polish', 'pt': 'Portuguese', 'ro': 'Romanian', 'ru': 'Russian', 'sr': 'Serbian', 'si': 'Sinhala', 'sk': 'Slovak', 'es': 'Spanish', 'su': 'Sundanese', 'sw': 'Swahili', 'sv': 'Swedish', 'ta': 'Tamil', 'te': 'Telugu', 'th': 'Thai', 'tr': 'Turkish', 'uk': 'Ukrainian', 'ur': 'Urdu', 'vi': 'Vietnamese', 'cy': 'Welsh', 'zh-cn': 'Chinese (Mandarin/China)', 'zh-tw': 'Chinese (Mandarin/Taiwan)', 'en-us': 'English (US)', 'en-ca': 'English (Canada)', 'en-uk': 'English (UK)', 'en-gb': 'English (UK)', 'en-au': 'English (Australia)', 'en-gh': 'English (Ghana)', 'en-in': 'English (India)', 'en-ie': 'English (Ireland)', 'en-nz': 'English (New Zealand)', 'en-ng': 'English (Nigeria)', 'en-ph': 'English (Philippines)', 'en-za': 'English (South Africa)', 'en-tz': 'English (Tanzania)', 'fr-ca': 'French (Canada)', 'fr-fr': 'French (France)', 'pt-br': 'Portuguese (Brazil)', 'pt-pt': 'Portuguese (Portugal)', 'es-es': 'Spanish (Spain)', 'es-us': 'Spanish (United States)'}
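Because tts_langs() returns an ordinary dictionary that maps language tags to names, a tag can be checked before any speech is requested. The sketch below assumes only that dictionary shape; resolve_language and demo are hypothetical helpers, not part of gTTS:

```python
def resolve_language(code, langs, default="en"):
    """Return the matching key of `langs` for `code`, otherwise `default`.

    `langs` is a dict such as the one returned by gtts.lang.tts_langs().
    Tags are compared case-insensitively, ignoring surrounding spaces.
    """
    wanted = code.strip().lower()
    for tag in langs:
        if tag.lower() == wanted:
            return tag
    return default


def demo():
    # Hypothetical usage; needs the gTTS package, and fetching the
    # language list may require an internet connection.
    import gtts

    langs = gtts.lang.tts_langs()
    tag = resolve_language("ES", langs)
    print(tag, "->", langs[tag])
```

Falling back to a default tag keeps a typing mistake from turning into a failed request; whether a silent fallback or an error is preferable depends on the application.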

Offline Text to Speech

We now know how to use the Google Text To Speech API, but what if we want to convert text to speech without an internet connection? The pyttsx3 library serves that purpose. It is a Python library that looks for a TTS engine pre-installed on our platform and uses it for the conversion.

The following are the text-to-speech synthesizers that the pyttsx3 library uses:

SAPI5 on Windows XP, Windows Vista, 8, 8.1 and 10
espeak on Ubuntu Desktop Edition 8.10, 9.04 and 9.10
NSSpeechSynthesizer on Mac OS X 10.5 and 10.6

The main features of the pyttsx3 library are:

It works completely offline.
Users can choose between the various voices installed on their system.
It can control the speed of speech.
It can adjust the volume.
It can save the speech audio to a file.

If we are using this library on a Linux operating system and the voice output does not work, we have to install espeak, ffmpeg, and libespeak1:

 $ sudo apt update && sudo apt install espeak ffmpeg libespeak1

To start using this library, we open a new Python file and import the library:

 # importing the text to speech library of python
 import pyttsx3

Now, we have to initialize the text-to-speech engine of the system:

 # initialize the text-to-speech engine of the system
 engine_system = pyttsx3.init()

For converting the text, we have to use say() and runAndWait() methods:

 # the text to convert to speech
 text_speech = "Python is a simple and most popular programming language"
 engine_system.say(text_speech)
 # play the speech
 engine_system.runAndWait()

The say() method adds an utterance to the engine's command queue, whereas runAndWait() runs the event loop, processes every queued command, and returns once the queue is empty.

So we can call say() several times and then call runAndWait() once at the end to hear all of the queued speech.
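The queue-then-run pattern can be sketched as a small helper. The names speak_all and demo are hypothetical; the engine argument is assumed to be the object returned by pyttsx3.init():

```python
def speak_all(engine, sentences):
    """Queue every sentence, then process the whole queue in one go."""
    for sentence in sentences:
        engine.say(sentence)    # only queues the utterance
    engine.runAndWait()         # blocks until the queued commands are done


def demo():
    # Hypothetical usage; needs pyttsx3 and a speech engine on the system.
    import pyttsx3

    speak_all(pyttsx3.init(), ["Hello.", "This is the second sentence."])
```

Nothing is spoken until runAndWait() is reached, which is why the call sits outside the loop.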

This library has some properties that the user can tweak depending on their requirements.

For example:

Let’s see the details of the speaking rate:

 # read the current speaking rate
 rate = engine_system.getProperty("rate")
 print(rate)

Output:

The pyttsx3 documentation lists 200 words per minute as the default rate. Now, let’s change the speaking rate to 300, which makes the speech noticeably faster.

 # set a new speaking rate to make the speech faster
 engine_system.setProperty("rate", 300)
 engine_system.say(text_speech)
 engine_system.runAndWait()

We can also set it to 100, which will make it slower:

 # set the speaking rate to make the speech slower
 engine_system.setProperty("rate", 100)
 engine_system.say(text_speech)
 engine_system.runAndWait()
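The same getProperty() and setProperty() pair also covers the volume mentioned in the feature list. As a hedged sketch, the hypothetical helper adjust_speech below scales the current rate by a factor instead of hard-coding a number, and clamps the volume to the 0.0 to 1.0 range that the pyttsx3 documentation describes:

```python
def adjust_speech(engine, rate_factor=1.0, volume=None):
    """Scale the current rate and optionally set the volume.

    Returns the (rate, volume) pair that was requested from the engine.
    """
    new_rate = int(engine.getProperty("rate") * rate_factor)
    engine.setProperty("rate", new_rate)

    if volume is None:
        # leave the volume untouched and report the current value
        new_volume = engine.getProperty("volume")
    else:
        # pyttsx3 documents volume as a float from 0.0 to 1.0
        new_volume = max(0.0, min(1.0, float(volume)))
        engine.setProperty("volume", new_volume)
    return new_rate, new_volume


def demo():
    # Hypothetical usage; needs pyttsx3 and a speech engine on the system.
    import pyttsx3

    engine = pyttsx3.init()
    print(adjust_speech(engine, rate_factor=0.75, volume=0.5))
    engine.say("This sentence is slower and quieter")
    engine.runAndWait()
```

The new settings apply to utterances queued after the setProperty() calls, so the helper is called before say().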

Another useful property of this library is voices, which lets us see the details of all the voices available on the system.

 # see the details of all voices available on the system
 voices = engine_system.getProperty("voices")
 print(voices)

Output:

[<pyttsx3.voice.Voice object at 0x000001994D817A20>, <pyttsx3.voice.Voice object at 0x000001994D817F898>, <pyttsx3.voice.Voice object at 0x000001994D6182D30>, <pyttsx3.voice.Voice object at 0x000001994E799C10>, <pyttsx3.voice.Voice object at 0x000001994D48CD90>]

As we can see, this system has five voices installed. Let's use the fifth one, which sits at index 4 because list indices start at 0.

For example:

 # switch to another voice (the fifth voice has index 4)
 engine_system.setProperty("voice", voices[4].id)
 engine_system.say(text_speech)
 engine_system.runAndWait()
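Printing the list of Voice objects shows only memory addresses, and a fixed index fails on a system with fewer voices. The sketch below, with the hypothetical helpers describe_voices and pick_voice_id, prints the name and id attributes of each voice and falls back to the first voice when the requested index does not exist. It assumes at least one voice is installed:

```python
def describe_voices(voices):
    """Return one readable line per voice: index, name and id."""
    return [f"{index}: {voice.name} ({voice.id})"
            for index, voice in enumerate(voices)]


def pick_voice_id(voices, index):
    """Return the id of voices[index], or of the first voice when the
    index is out of range."""
    if 0 <= index < len(voices):
        return voices[index].id
    return voices[0].id


def demo():
    # Hypothetical usage; needs pyttsx3 and a speech engine on the system.
    import pyttsx3

    engine = pyttsx3.init()
    voices = engine.getProperty("voices")
    print("\n".join(describe_voices(voices)))
    engine.setProperty("voice", pick_voice_id(voices, 4))
    engine.say("Testing the selected voice")
    engine.runAndWait()
```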

If we want to save the audio to a file instead of playing it with say(), we can use the save_to_file() method.

For example:

 # save the speech audio into a file
 engine_system.save_to_file(text_speech, "text_to_Speech.mp3")
 engine_system.runAndWait()

Example of listening for events:

 import pyttsx3

 # callbacks receive their arguments by keyword, so the parameter names matter
 def onStart(name):
     print('starting', name)

 def onWord(name, location, length):
     print('word', name, location, length)

 def onEnd(name, completed):
     print('finishing', name, completed)

 engine = pyttsx3.init()
 engine.connect('started-utterance', onStart)
 engine.connect('started-word', onWord)
 engine.connect('finished-utterance', onEnd)
 sen = 'One day the people that don’t even believe in you will tell everyone how they met you'
 engine.say(sen)
 engine.runAndWait()

Sample output (the exact lines and values depend on the platform's speech driver):

 word None 1 559936
 word None 1 559936
 word None 1 559936
 finishing None True
 finishing None True
 finishing None True
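Printing is the simplest way to watch the events, but collecting them in a list makes them easier to inspect afterwards. The following is a sketch under that assumption; make_event_recorder and demo are hypothetical names, while the event names and callback parameters are the ones used in the example above:

```python
def make_event_recorder():
    """Return (events, handlers): a list that fills up as callbacks fire,
    and a dict mapping pyttsx3 event names to those callbacks."""
    events = []

    def on_start(name):
        events.append(("started-utterance", name))

    def on_word(name, location, length):
        events.append(("started-word", name, location, length))

    def on_end(name, completed):
        events.append(("finished-utterance", name, completed))

    handlers = {
        "started-utterance": on_start,
        "started-word": on_word,
        "finished-utterance": on_end,
    }
    return events, handlers


def demo():
    # Hypothetical usage; needs pyttsx3 and a speech engine on the system.
    import pyttsx3

    engine = pyttsx3.init()
    events, handlers = make_event_recorder()
    for topic, callback in handlers.items():
        engine.connect(topic, callback)
    engine.say("Recording events instead of printing them")
    engine.runAndWait()
    print(events)
```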

Conclusion:

In this article, we have discussed two Python libraries for converting text into speech: gTTS for online conversion and pyttsx3 for offline conversion. We have also seen how to change the speaking rate and how to switch between the voices available on the system.
