If you want to extract the text content of a Word file, there are a few ways to do it in Python. Unfortunately, most of them have dependencies, need to run an external command in a subprocess, or are heavy and complex, relying on an office suite, for example. I find that the best solution among those on the Stack Overflow page is python-docx. But using it brings two dependencies: python-docx itself and lxml. Installing python-docx is not a big problem. Unfortunately, lxml is sometimes hard to install or, at a minimum, requires compilation.
To avoid that, and inspired by python-docx, I created a simple function that extracts text from .docx files and requires no dependencies, using only the standard library. This makes it easy to incorporate into any Python project.
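
As a rough illustration of the idea, here is a minimal sketch of how such a function could look. It is not necessarily the exact function described above: the name `get_docx_text` and the choice to join paragraphs with a blank line are assumptions made for this example. It relies on the fact that a .docx file is a ZIP archive whose main body is stored as XML in `word/document.xml`, so `zipfile` and `xml.etree.ElementTree` from the standard library are enough.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def get_docx_text(path_or_file):
    """Return the body text of a .docx file (hypothetical helper name).

    Accepts a filesystem path or a binary file-like object.
    """
    # A .docx file is a ZIP archive; the main body lives in this member.
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)

    paragraphs = []
    # <w:p> marks a paragraph; <w:t> holds the text of each run inside it.
    for para in root.iter(W_NS + "p"):
        pieces = [node.text for node in para.iter(W_NS + "t") if node.text]
        if pieces:  # skip paragraphs that contain no text
            paragraphs.append("".join(pieces))

    # Assumption: separate paragraphs with one blank line.
    return "\n\n".join(paragraphs)
```

Calling `get_docx_text("report.docx")` (the file name is only a placeholder) would return the document text as a single string. This sketch reads only the main document body; headers, footers, footnotes, tabs, and manual line breaks are ignored, and it does not try to handle malformed archives.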