python-pptx is a Python library for creating and updating PowerPoint (.pptx) presentations. A typical use is generating a customized presentation from database content, which a user then downloads by clicking a link in a web application. It also makes bulk updates across many presentation files easier and can automate the production of slides that would be tedious to build by hand.
This library is hosted on PyPI and can easily be installed with the pip installer. It depends on a few other packages, notably lxml, Pillow, and XlsxWriter (the last is needed only for the charting features).
To install python-pptx on your system, run the command pip install python-pptx in the command prompt or the terminal.
Note: When you install python-pptx with pip, any missing dependencies are downloaded automatically.
Working with Presentations
Strictly speaking, python-pptx only makes changes to existing presentations; when we start from a presentation that has no slides, it simply feels like we are creating a new one. Even after all the slides are deleted, the theme, the slide master, and the slide layouts remain, and they play a major role in determining the look of the presentation.
Now let's get started using this library.
Opening the Presentation
You do not need an existing file to get started. Calling Presentation() without a file name creates a new presentation from the default template; the code below then saves it to a file named 'first_presentation.pptx'.
from pptx import Presentation
presents = Presentation()
presents.save('first_presentation.pptx')
To open an existing presentation
To work on a previously created presentation, pass the name of its file to Presentation() when opening it.
Note: If you save the changes under the same name that you opened, python-pptx overwrites the original file without warning, and the previous data is lost.
Opening or Saving a file-like presentation
python-pptx can also open or save a presentation from a file-like object. This is useful when the presentation arrives over a network or is stored in a database and we do not want to, or are not allowed to, touch the file system. We can open or save such a presentation by passing an open file or an in-memory stream object (StringIO in Python 2, io.BytesIO in Python 3) in place of the file name.
Working with Slides
Just as in PowerPoint itself, adding a slide with python-pptx starts with choosing a slide layout, because every slide is based on one.
So, each time you create a new slide, you need to specify its layout. Let's cover the basics of slide layouts, which you will need when adding content to a slide.
A slide layout is a template for a slide. It determines the structure and arrangement of the slide: anything that appears on the layout also appears on the slides based on it, and any change made to the layout is automatically applied to those slides. Each slide layout depends on a slide master in the same way, so a change meant for the whole presentation is best made once on the slide master rather than individually on each slide. A presentation can have multiple slide masters, but usually it has just one.
In PowerPoint, there are nine standard slide layouts: Title, Title and Content, Section Header, Two Content, Comparison, Title Only, Blank, Content with Caption, and Picture with Caption, each having zero or more placeholders. Placeholders are pre-formatted areas reserved for a title, content, or an image. Usually, the slide layouts occur in the following sequence.
- Title (presentation title slide)
- Title and Content
- Section Header (sometimes called Segue)
- Two Content (side-by-side text boxes under the same heading)
- Comparison (similar to Two Content, but each text box has its own heading so the two can be contrasted)
- Title Only
- Blank
- Content with Caption
- Picture with Caption
In python-pptx, these layouts are zero-indexed. That is, they are presents.slide_layouts[0] through presents.slide_layouts[8]. This ordering is a convention that PowerPoint themes usually follow, but it is not guaranteed to hold for every template.
Note: If the layouts in your template are in a different order, you can view them in the slide master view and determine the index of each layout by counting down from the top, starting at zero.
You can define a named constant for the layout index when creating a slide. It is not required, but it is good practice: when dealing with a large number of slides, it makes it easy to see which layout each slide uses.
The add_slide() method adds a slide to the presentation; it appends the slide at the end of your presentation.
from pptx import Presentation

SLD_LAYOUT_TITLE_AND_CONTENT = 1

presents = Presentation()
slide_layout = presents.slide_layouts[SLD_LAYOUT_TITLE_AND_CONTENT]
slide = presents.slides.add_slide(slide_layout)
presents.save('sec_presentation.pptx')
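Note: At the time of writing, adding a slide is the only operation the library offers on the slide collection. Deleting and moving slides are on the project's backlog, and copying a slide is hard to get right in the general case, so it is likely to arrive only after more of that backlog has been worked through.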
Shapes in Slides
Almost every element inside a slide is considered a shape. The slide background is the only part in a PowerPoint that is not considered a shape while using python-pptx.
Six types of shapes can be placed on a slide using python-pptx:
Auto Shape: A regular shape such as a rectangle, ellipse, or arrow that can have an outline and can be empty or filled. Auto shapes come in a variety of pre-set forms. Some of them can contain text or have special adjustments.
Picture: Any image inserted in a PowerPoint presentation, including an image from the clip art gallery, is referred to as a picture.
Graphic Frame: A container that holds a table, chart, SmartArt diagram, or media clip. It cannot be added on its own; it appears in the file when you insert a graphical object.
Group Shape: A shape that groups several other shapes so that they can be selected, moved, or resized all at once.
Line/Connector: A linear shape used to connect other shapes; the shapes remain connected through the connector even when they are moved from their initial positions.
Content Part: It can be used to embed foreign XML in your presentation.
Working with AutoShapes
There are a total of 182 auto shapes that the user can insert in a presentation, of which 120 have adjustments that let the user change their form. Many auto shapes share a common set of properties.
Positions and dimensions of shapes in PowerPoint are expressed in English Metric Units (EMU), so it helps to understand EMU before adding auto shapes with python-pptx. An EMU is a very small integer unit of length: there are 914,400 EMU to the inch. Rather than working with such large numbers directly, the code below uses the Inches helper from pptx.util, which converts a length in inches to EMU.
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE
from pptx.util import Inches

prs = Presentation()
# Layout 5 is "Title Only" in the default template
title_only_slide_layout = prs.slide_layouts[5]
slide = prs.slides.add_slide(title_only_slide_layout)
shapes = slide.shapes
shapes.title.text = 'Add an AutoShape to your slide'

left = Inches(0.93)
top = Inches(3.0)
width = Inches(1.75)
height = Inches(1.0)

# The first step is a pentagon
shape = shapes.add_shape(MSO_SHAPE.PENTAGON, left, top, width, height)
shape.text = 'Your level 1'

# The remaining steps are chevrons that overlap the previous shape slightly
left = left + width - Inches(0.4)
width = Inches(2.0)
for n in range(2, 6):
    shape = shapes.add_shape(MSO_SHAPE.CHEVRON, left, top, width, height)
    shape.text = 'Your level %d' % n
    left = left + width - Inches(0.4)

prs.save('third_presentation.pptx')
Working with Placeholders
A placeholder is a pre-formatted container into which content such as text can be added. Slide layouts provide several placeholders that are ready to use on each new slide. This speeds up building a presentation, because the user does not have to create the container and can work directly on the content to be inserted in the slide.
Note: Placeholders are an orthogonal category of shape, meaning that several shape types can act as a placeholder: auto shapes, pictures, and graphic frames can each be one.
Types of Placeholders
There are various types of placeholders, which are as follows:
Title, Center Title, Subtitle, Body: These placeholders combine to form the basic layout of any presentation. They can contain only text, and the text inside them can be bulleted.
Content: This placeholder usually makes up the body of a slide. It allows you to insert a table, chart, SmartArt, picture, clip art, or media clip in your presentation.
Picture, Clip Art: Both can be used to add an image to the slide. In PowerPoint, the insert button on a clip art placeholder takes you to the clip art gallery, where you can choose the image you wish to add to your presentation.
Chart, Table, SmartArt: These are used to place rich graphical content in your presentation. They make the information more presentable and easier to read.
Media Clip: Allows the user to add video or sound recordings to the presentation.
Date, Footer, Slide Number: These three placeholders are available on most masters and layouts and can be used directly.
Header: This placeholder is not available on the slide master or slide layouts and can only be added to the notes master or the handout master.
Vertical Body, Vertical Object, Vertical Title: These are used for text written vertically, as in languages such as Japanese.
As discussed earlier, every placeholder is a shape, so it can be reached through the shapes property of the slide. The slide also has a placeholders property, in which a placeholder can be looked up by its idx value. This integer identifies the slide layout placeholder from which the placeholder inherits its properties. It stays the same for the whole life of the slide and is the same on every slide created from a particular layout.
You can get the idx value of every placeholder on a slide like this:
from pptx import Presentation

present = Presentation()
# Layout 8 is "Picture with Caption" in the default template
slide = present.slides.add_slide(present.slide_layouts[8])
for shape in slide.placeholders:
    print('%d %s' % (shape.placeholder_format.idx, shape.name))
Inserting Content in Placeholder
Adding content to a placeholder is straightforward. Certain placeholders have specialized methods for inserting content; as of now, tables, pictures, and charts can be added using such methods. Text is added to the body and title placeholders just as it is to auto shapes.
Inserting Picture in Picture Placeholder
The inserted picture is scaled proportionately and cropped so that it fills the entire placeholder. No cropping occurs only when the picture's aspect ratio equals that of the placeholder. The cropping can be adjusted afterwards using the crop properties on the placeholder.
Now that we have discussed all the necessary elements, we can implement the code to add the text to our presentation.
from pptx import Presentation

presents = Presentation()
# Layout 0 is the title slide layout in the default template
title_slide_layout = presents.slide_layouts[0]
slide = presents.slides.add_slide(title_slide_layout)

# The title has its own shortcut; the subtitle is the placeholder with idx 1
title = slide.shapes.title
subtitle = slide.placeholders[1]
title.text = "Hi Everyone"
subtitle.text = "We have successfully added text in the presentation"

presents.save('first_presentation.pptx')
Extracting Content from Presentation
We can also use Python to extract content from a PowerPoint presentation. Now, let us write the code to extract the text added above.
from pptx import Presentation

# path holds the location of the presentation to read; a raw string keeps
# the backslashes in the Windows path from being treated as escape characters
path = r"C:\Users\hp\Desktop\pythonProject\first_presentation.pptx"
presents = Presentation(path)

# store_all_text collects the text of every run found in the presentation
store_all_text = []
for slide in presents.slides:
    for shape in slide.shapes:
        # skip shapes that cannot hold text, such as pictures
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            for run in paragraph.runs:
                store_all_text.append(run.text)
print(store_all_text)
Adding Table to Your Presentation
We can implement a table in the presentation using python-pptx. To insert the table, one must understand the following terms:
Table: An arrangement of data in rows and columns. It is a convenient way to store and display information.
Cell: The basic unit of the table and the container that holds a value. In PowerPoint, a cell can contain only text; it does not support pictures or videos.
Row: A collection of horizontally adjacent cells that share the same upper and lower boundary. It extends from the leftmost to the rightmost cell of the table at a particular level.
Column: A collection of vertically adjacent cells that share the same left and right boundary.
Merged Cell: A combination of horizontally adjacent cells, vertically adjacent cells, or both; the combined cell works as a single cell.
from pptx import Presentation
from pptx.util import Inches

presents = Presentation()
# Layout 5 is "Title Only" in the default template
title_only_slide_layout = presents.slide_layouts[5]
slide = presents.slides.add_slide(title_only_slide_layout)
shapes = slide.shapes
shapes.title.text = 'Insert a table to your presentation'

rows = cols = 2
left = top = Inches(2.0)
width = Inches(6.0)
height = Inches(1.0)
# add_table returns a graphic frame; its table property is the table itself
table = shapes.add_table(rows, cols, left, top, width, height).table

# set the width of each column here
table.columns[0].width = Inches(2.0)
table.columns[1].width = Inches(4.0)

# Add the heading for the columns
table.cell(0, 0).text = 'First Text'
table.cell(0, 1).text = 'Second Text'

# Add the text that you want to add in your table cells
table.cell(1, 0).text = 'Let us add text here'
table.cell(1, 1).text = 'More text goes here'

presents.save('table.pptx')