Questions tagged [python-docx]

A python library to create, read and write Microsoft Office Word 2007 docx files.

The docx module creates, reads and writes Microsoft Office Word 2007 docx files.

Including the following features:

Creation:

  • Paragraphs
  • Bullets
  • Numbered lists
  • Document properties (author, company, etc)
  • Multiple levels of headings
  • Tables
  • Section and page breaks
  • Images

Modification:

  • Search and replace
  • Extract plain text of document
  • Add and delete items anywhere within the document
  • Change document properties
  • Run xpath queries against particular locations in the document - useful for retrieving data from user-completed templates.

For detailed information and examples, visit the python-docx documentation.

See also the official GitHub homepage.

1026 questions
7
votes
1 answer

Bullet Lists in python-docx

I am trying to get this to work in python-docx: A bullet list I can get using this: from docx import Document doc = Document() p = doc.add_paragraph() p.style = 'List Bullet' r = p.add_run() r.add_text("Item 1") # Something's gotta come here to…
Vizag
  • 657
  • 1
  • 4
  • 26
7
votes
1 answer

Converting embedded Excel objects from a docx file into images

I am using pandoc (via pypandoc) to convert docx files into markdown, on a non-Windows machine. Those files can contain images, but also other embedded objects. pandoc is actually able to translate embedded Powerpoint presentations (into EMF files),…
fralau
  • 2,040
  • 18
  • 33
7
votes
2 answers

How Can I Write Charts to Python DocX Document

python beginner here with a simple question. Been using Python-Docx to generate some reports in word from Python data (generated from excel sheets). So far so good, but would like to add a couple of charts to the word document based on the data in…
John Rogerson
  • 935
  • 1
  • 12
  • 34
7
votes
3 answers

Extract image position from .docx file using python-docx

I'm trying to get the image index from the .docx file using python-docx library. I'm able to extract the name of the image, image height and width. But not the index where it is in the word file import docx doc = docx.Document(filename) for s in…
argo
  • 4,780
  • 3
  • 21
  • 40
7
votes
0 answers

Extract Header and Table text from a .docx file

I'm trying to extract page and header data from a docx file. The file is several hundred pages, each with a table and a header. The header has pertinent information that needs to be paired with each table. I'm able to extract the header and table…
user2682863
  • 2,680
  • 1
  • 19
  • 34
7
votes
1 answer

How can I set the language in text with python-docx

I create word files using the python-docx library. I want to be able to set different parts of the document to different languages. How can the language be set with python-docx? Preferrably, I would like to do it at the run-level, since I need…
NiklasR
  • 403
  • 3
  • 16
7
votes
1 answer

Finding word on page(s) in document

I am looking for an elegant solution to find on what page(s) in a document a certain word occurs that I have stored in a python dictionary/list. I first considered .docx format as an input and had a look at PythonDocx which has a search function,…
birgit
  • 981
  • 2
  • 17
  • 38
7
votes
4 answers

Python Docx - Sections - Page Orientation

The following code tries to use landscape orientation, but the document is created as potrait. Can you suggest where the problem is? from docx import Document from docx.enum.section import WD_ORIENT document = Document() section =…
Jesse
  • 237
  • 1
  • 3
  • 10
7
votes
1 answer

How to find a list in docx using python?

I'm trying to pull apart a word document that looks like this: 1.0 List item 1.1 List item 1.2 List item 2.0 List item It is stored in docx, and I'm using python-docx to try to parse it. Unfortunately, it loses all the numbering at the start. I'm…
SeanVDH
  • 836
  • 8
  • 28
6
votes
6 answers

Add page number using python-docx

I am trying to add a page number in the footer of a word doc using python-docx. So far, I haven't been able to find how to do so. This question address how to find a page number (or how you cannot). This one talks about creating a template and…
max_max_mir
  • 1,061
  • 1
  • 13
  • 31
6
votes
1 answer

How do I copy the contents of a word document?

I want to write a program that copies text from a Word document and pastes it to another. I'm trying to do that using the python-docx library. I was able to do that with the following code, but it does not copy the bold, italic, underlined nor…
E. Epstein
  • 457
  • 7
  • 21
6
votes
1 answer

docx-python word doc page break

I am trying to add a page break to the middle of a document using the docx-python library. It would appear that when adding a pagebreak, the page break is added to the end of the document. Is there a method to add a page break to a specific…
Robert Bailey
  • 75
  • 1
  • 9
6
votes
1 answer

how to create a dataframe from a table in a word document (.docx) file using pandas

I have a word file (.docx) with table of data, I am trying to create a pandas data frame using that table, I have used docx and pandas module. But I could not create a data frame. from docx import Document document = Document('req.docx') for table…
pyd
  • 4,077
  • 9
  • 23
  • 53
6
votes
4 answers

docx center text in table cells

So I am starting to use pythons docx library. Now, I create a table with multiple rows, and only 2 columns, it looks like this: Now, I would like the text in those cells to be centered horizontally. How can I do this? I've searched through docx API…
minecraftplayer1234
  • 1,697
  • 3
  • 17
  • 47
6
votes
1 answer

Python - Remove header and footer from docx file

I need to remove headers and footers in many docx files. I was currently trying using python-docx library, but it doesn't support header and footer in docx document at this time (work in progress). Is there any way to achieve that in Python? As I…
drjackild
  • 364
  • 3
  • 17
1 2
3
68 69