library(officer)
# Package `magrittr` makes officer usage easier.
library(magrittr)
Use the function read_docx to create an R object representing a Word document. An initial Word file can be specified with the argument path. If none is provided, an empty document located in the package directory is used. The available formats and styles are those defined in the initial file.
my_doc <- read_docx()
# display styles
styles_info(my_doc)
## # A tibble: 21 × 5
##    style_type style_id       style_name             is_custom is_default
##    <chr>      <chr>          <chr>                  <lgl>     <lgl>
##  1 paragraph  Normal         Normal                 FALSE     TRUE
##  2 paragraph  Titre1         heading 1              FALSE     FALSE
##  3 paragraph  Titre2         heading 2              FALSE     FALSE
##  4 paragraph  Titre3         heading 3              FALSE     FALSE
##  5 character  Policepardfaut Default Paragraph Font FALSE     TRUE
##  6 table      TableauNormal  Normal Table           FALSE     TRUE
##  7 numbering  Aucuneliste    No List                FALSE     TRUE
##  8 character  strong         strong                 TRUE      FALSE
##  9 paragraph  centered       centered               TRUE      FALSE
## 10 table      tabletemplate  table_template         TRUE      FALSE
## # ... with 11 more rows
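Since styles_info returns a data frame, the style names that can be passed to the style argument can be pulled out with ordinary subsetting. The following is a minimal sketch, assuming the default template; the variable names styles and par_styles are only illustrative.

```r
library(officer)

doc <- read_docx()
styles <- styles_info(doc)

# Keep only the paragraph styles: these are the names that can be
# passed as `style` when adding paragraphs (illustrative variable names)
par_styles <- styles$style_name[styles$style_type == "paragraph"]
print(par_styles)
```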
Let’s add an image to the document first.
src <- tempfile(fileext = ".png")
png(filename = src, width = 5, height = 6, units = 'in', res = 300)
barplot(1:10, col = 1:10)
dev.off()
## quartz_off_screen
## 2
my_doc <- my_doc %>%
body_add_img(src = src, width = 5, height = 6, style = "centered")
Then some paragraphs.
my_doc <- my_doc %>%
body_add_par("Hello world!", style = "Normal") %>%
body_add_par("", style = "Normal") # blank paragraph
And a table.
my_doc <- my_doc %>%
body_add_table(iris, style = "table_template")
The file can be generated using the function print and its argument target:
print(my_doc, target = "assets/docx/first_example.docx") %>%
invisible()
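To try this without an assets/docx directory, the document can be written to a temporary file instead. This is a small sketch; the temporary path out is an assumption made for illustration, not part of the original example.

```r
library(officer)

# Hypothetical output location: a temporary file instead of assets/docx/
out <- tempfile(fileext = ".docx")

doc <- read_docx()
doc <- body_add_par(doc, "Hello world!", style = "Normal")

# print() writes the Word file to the path given in `target`
print(doc, target = out)

# Check that the file was actually written
file.exists(out)
```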
To add paragraphs, tables, images or other elements to the document, use functions whose names start with body_add_, such as body_add_par, body_add_img and body_add_table shown above.
A cursor is available and can be moved so that content is added relative to its position:
before: will insert a new element before the selected element in the document.
after: will insert a new element after the selected element in the document.
on: will replace the selected element in the document with a new element.
The cursor functions used below are cursor_begin, cursor_reach, cursor_forward and cursor_end.
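The three positions can be tried with the pos argument of body_add_par. The sketch below is illustrative only: the paragraph texts are made up, and it assumes the installed officer version provides docx_summary to read the paragraph texts back.

```r
library(officer)

doc <- read_docx()
doc <- body_add_par(doc, "paragraph A", style = "Normal")
doc <- body_add_par(doc, "paragraph C", style = "Normal")

# "before": insert a new paragraph before the one under the cursor (C)
doc <- body_add_par(doc, "paragraph B", style = "Normal", pos = "before")

# "on": move the cursor to paragraph A and replace it
doc <- cursor_reach(doc, keyword = "paragraph A")
doc <- body_add_par(doc, "paragraph A (replaced)", style = "Normal", pos = "on")

# Read the paragraph texts back, ignoring any empty paragraph
texts <- docx_summary(doc)$text
texts <- texts[!is.na(texts) & nzchar(texts)]
print(texts)
```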
In order to illustrate cursor functions, a document made of several paragraphs will be used (let’s use officer for that).
read_docx() %>%
body_add_par("paragraph 1", style = "Normal") %>%
body_add_par("paragraph 2", style = "Normal") %>%
body_add_par("paragraph 3", style = "Normal") %>%
body_add_par("paragraph 4", style = "Normal") %>%
body_add_par("paragraph 5", style = "Normal") %>%
body_add_par("paragraph 6", style = "Normal") %>%
body_add_par("paragraph 7", style = "Normal") %>%
print(target = "assets/docx/init_doc.docx" ) %>%
invisible()
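The seven near-identical calls above can also be generated in a loop. This is only an alternative sketch of the same idea, not the original code; it assumes docx_summary is available for checking the result.

```r
library(officer)

doc <- read_docx()

# Add "paragraph 1" ... "paragraph 7", one paragraph per iteration
for (i in 1:7) {
  doc <- body_add_par(doc, paste("paragraph", i), style = "Normal")
}

# Read the paragraph texts back, ignoring any empty paragraph
texts <- docx_summary(doc)$text
texts <- texts[!is.na(texts) & nzchar(texts)]
print(texts)
```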
Now, let’s read init_doc.docx with read_docx and manipulate its content with cursor functions.
doc <- read_docx(path = "assets/docx/init_doc.docx") %>%
# default template contains only an empty paragraph
# Using cursor_begin and body_remove, we can delete it
cursor_begin() %>% body_remove() %>%
# Let's add text at the beginning of the
# paragraph containing the text "paragraph 4"
cursor_reach(keyword = "paragraph 4") %>%
slip_in_text("This is ", pos = "before", style = "Default Paragraph Font") %>%
# move the cursor forward and end a section
cursor_forward() %>%
body_add_par("The section stop here", style = "Normal") %>%
body_end_section(landscape = TRUE) %>%
# move the cursor to the end of the document
cursor_end() %>%
body_add_par("The document ends now", style = "Normal")
print(doc, target = "assets/docx/cursor.docx") %>%
invisible()
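The effect of moving the cursor can be checked without opening Word. The following sketch works on an in-memory document; the paragraph texts are invented for illustration, and cursor_backward and docx_summary are assumed to be available in the installed officer version.

```r
library(officer)

doc <- read_docx()
for (i in 1:3) {
  doc <- body_add_par(doc, paste("paragraph", i), style = "Normal")
}

# Reach "paragraph 2", step back one element, then add after that element
doc <- cursor_reach(doc, keyword = "paragraph 2")
doc <- cursor_backward(doc)
doc <- body_add_par(doc, "added after the first paragraph", style = "Normal")

# Jump to the end of the document and append there
doc <- cursor_end(doc)
doc <- body_add_par(doc, "added at the end", style = "Normal")

# Read the paragraph texts back, ignoring any empty paragraph
texts <- docx_summary(doc)$text
texts <- texts[!is.na(texts) & nzchar(texts)]
print(texts)
```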
Text and images can be inserted at the beginning or the end of the selected paragraph (the one the cursor is on). The functions used below are slip_in_img and slip_in_text.
library(magrittr)
img.file <- file.path( Sys.getenv("R_HOME"), "doc", "html", "logo.jpg" )
read_docx() %>%
body_add_par("R logo: ", style = "Normal") %>%
slip_in_img(src = img.file, style = "strong", width = .3, height = .3) %>%
slip_in_text("This is official ", style = "strong", pos = "before") %>%
slip_in_text(" that can be found here: ", style = "strong", pos = "after") %>%
slip_in_text(img.file, style = "strong", pos = "after") %>%
print(target = "assets/docx/slip_in_demo.docx") %>%
invisible()
The function body_remove removes content from a Word document. Used together with the cursor_* functions, it is a convenient tool for updating an existing document.
For illustration purposes, we will first generate a document that will serve as the initial document when showing how to use body_remove.
library(officer)
library(magrittr)
str1 <- "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " %>%
rep(20) %>% paste(collapse = "")
str2 <- "Drop that text"
str3 <- "Aenean venenatis varius elit et fermentum vivamus vehicula. " %>%
rep(20) %>% paste(collapse = "")
my_doc <- read_docx() %>%
body_add_par(value = str1, style = "Normal") %>%
body_add_par(value = str2, style = "centered") %>%
body_add_par(value = str3, style = "Normal")
print(my_doc, target = "assets/docx/ipsum_doc.docx") %>% invisible()
The file ipsum_doc.docx now exists and contains a paragraph with the text "that text". In the following example, we will position the cursor on that paragraph and then delete it:
my_doc <- read_docx(path = "assets/docx/ipsum_doc.docx") %>%
cursor_reach(keyword = "that text") %>%
body_remove()
print(my_doc, target = "assets/docx/ipsum_doc.docx") %>% invisible()
The text search is made via xpath 1.0 and regular expressions are not supported.
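The removal can be verified on an in-memory document before anything is written to disk. This is a minimal sketch with made-up paragraph texts; it uses a plain literal keyword and assumes docx_summary is available for reading the content back.

```r
library(officer)

doc <- read_docx()
doc <- body_add_par(doc, "Keep this paragraph", style = "Normal")
doc <- body_add_par(doc, "Drop that text", style = "Normal")
doc <- body_add_par(doc, "Keep this one too", style = "Normal")

# Position the cursor on the paragraph containing "that text" and delete it
doc <- cursor_reach(doc, keyword = "that text")
doc <- body_remove(doc)

# Read the remaining paragraph texts, ignoring any empty paragraph
texts <- docx_summary(doc)$text
texts <- texts[!is.na(texts) & nzchar(texts)]
print(texts)
```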
Sections can be added to a document with the function body_end_section.
A section starts at the end of the previous section (or at the beginning of the document if there is no preceding section) and stops where the section is declared. The function reflects that (complicated) Word concept by adding a section ending attached to the paragraph where the cursor is.
str1 <- "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " %>%
rep(30) %>% paste(collapse = "")
str2 <- "Aenean venenatis varius elit et fermentum vivamus vehicula. " %>%
rep(30) %>% paste(collapse = "")
my_doc <- read_docx() %>%
slip_in_text(str = str1, style = "strong") %>%
body_add_par(value = str2, style = "centered") %>%
body_end_section(landscape = TRUE, colwidths = c(.6, .4), space = .05, sep = FALSE) %>%
body_add_par(value = str3, style = "Normal")
print(my_doc, target = "assets/docx/section.docx") %>% invisible()
In the previous example, the first two paragraphs are in a two-column section and the third is in a default section.
One can combine slip_in_seqfield and slip_in_text to prefix a paragraph with references (i.e. the chapter number and the graphic index in the document). However, producing a plot or a table together with its caption can be verbose.
Shortcut functions are implemented in the object shortcuts (at the very least, it gives you a code template to modify if it does not exactly fit your needs). slip_in_tableref, slip_in_plotref and body_add_gg can make life easier.
Below is an illustration; the code is linear and easy to read:
library(magrittr)
library(officer)
library(ggplot2)
gg1 <- ggplot(data = iris, aes(Sepal.Length, Petal.Length)) + geom_point()
gg2 <- ggplot(data = iris, aes(Sepal.Length, Petal.Length, color = Species)) + geom_point()
doc <- read_docx() %>%
body_add_par(value = "Table of content", style = "heading 1") %>%
body_add_toc(level = 2) %>%
body_add_par(value = "Tables", style = "heading 1") %>%
body_add_par(value = "dataset mtcars", style = "heading 2") %>%
body_add_table(value = head(mtcars)[, 1:4], style = "table_template" ) %>%
body_add_par(value = "data mtcars", style = "table title") %>%
shortcuts$slip_in_tableref(depth = 2) %>%
body_add_par(value = "dataset iris", style = "heading 2") %>%
body_add_table(value = head(iris), style = "table_template" ) %>%
body_add_par(value = "data iris", style = "table title") %>%
shortcuts$slip_in_tableref(depth = 2) %>%
body_end_section(landscape = FALSE ) %>%
body_add_par(value = "plot examples", style = "heading 1") %>%
body_add_gg(value = gg1, style = "centered" ) %>%
body_add_par(value = "graph example 1", style = "graphic title") %>%
shortcuts$slip_in_plotref(depth = 1) %>%
body_add_par(value = "plot 2", style = "heading 2") %>%
body_add_gg(value = gg2, style = "centered" ) %>%
body_add_par(value = "graph example 2", style = "graphic title") %>%
shortcuts$slip_in_plotref(depth = 2) %>%
body_end_section(landscape = TRUE) %>%
body_add_par(value = "Table of tables", style = "heading 2") %>%
body_add_toc(style = "table title") %>%
body_add_par(value = "Table of graphics", style = "heading 2") %>%
body_add_toc(style = "graphic title")
print(doc, target = "assets/docx/toc_and_captions.docx") %>% invisible()
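The table-of-contents part of the example above can be isolated as follows. This is a reduced sketch with invented headings, written to a temporary file (an assumption for illustration). Note that the table of contents is a Word field: its entries are computed by Word when the fields are updated on opening the file, not by R.

```r
library(officer)

doc <- read_docx()
doc <- body_add_par(doc, "Table of contents", style = "heading 1")
# Insert a TOC field covering heading levels 1 and 2
doc <- body_add_toc(doc, level = 2)
doc <- body_add_par(doc, "First chapter", style = "heading 1")
doc <- body_add_par(doc, "A sub-section", style = "heading 2")
doc <- body_add_par(doc, "Some text.", style = "Normal")

# Hypothetical output location: a temporary file
out <- tempfile(fileext = ".docx")
print(doc, target = out)
```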