Brew and knit a PDF report split by a variable with special characters (å; æ; ø): encoding problem

r character-encoding

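I am trying to use brew and knitr to produce a sectioned PDF report based on a grouping variable. My grouping variable may contain special characters (umlauts) such as å, æ and ø.

Using \usepackage[utf8]{inputenc} handles umlauts in the document title well (see the example below). However, umlauts in the grouping variable then produce an error. On the other hand, when I try \usepackage[T1]{fontenc}, umlauts in the grouping variable are handled correctly, but now the title is not encoded properly.

I am struggling to get the encoding right in both the title and the grouping variable.

Below is an example in which I try to produce a PDF report with summary statistics for each species in the iris data set. I hope it illustrates my problem.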

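R code to prepare the data, without umlauts

Create a summary table for each species in the built-in iris data set. To begin with, use the original Species names, without umlauts. The only umlauts are in the document \title (see the code for the .rnw template file). Store the summary tables in a list.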

library(plyr)    # dlply() and .()
library(xtable)  # xtable()
data(iris)
iris_tbl <- dlply(.data = iris, .variables = .(Species), function(x) xtable(summary(x)))

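Code for the .rnw template file

In my example, I named the template file containing the code below iris_umlaut_tbl.rnw. This file is used as input to the brew_knit_pdf function in the R script.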

\documentclass{article}

% \usepackage[T1]{fontenc}    
\usepackage[utf8]{inputenc}

\usepackage{geometry}
\geometry{tmargin=2.5cm,bmargin=2.5cm,lmargin=2.5cm,rmargin=2.5cm}

\begin{document}

\begin{titlepage}

\title{Using brew and knitr to produce one PDF report split by a grouping variable.\\Problem with å æ ø in grouping variable}

\clearpage\maketitle
\thispagestyle{empty}

\tableofcontents

\end{titlepage}
\newpage


\section{Summary statistics for each species}

% R code loop wrapped in brew syntax, which brews the template file xxx.rnw to a new .rnw file xxx_out.rnw, which has one section for each group that is looped over, i.e. the names of the list iris_tbl produced in the R script.

<% for (Sp in names(iris_tbl)) { -%>

\subsection{<%= Sp %>}
<<sum-<%= Sp %>, echo=FALSE, results='asis'>>=
print(iris_tbl[["<%= Sp %>"]])
@
\newpage
<% } %>

\end{document}

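Update:

Many thanks to the brew package maintainer, Jeffrey Horner, for his "off-SO" input. He had no encoding problems when he ran my script on Ubuntu with command-line R. That gave me some new hope. I have not had the chance to run Ubuntu myself, but today I updated RStudio (0.97.449) and set the default encoding to ISO8859-1 (thanks Yihui!). Now, with

\usepackage[latin1]{inputenc}

in the .rnw file, special characters are encoded correctly in both the title and the grouping variable. In addition,

\usepackage[ansinew]{inputenc}

works as well. I am not sure what went wrong in my initial attempts. Possibly RStudio did not apply the default encoding set in Options, which I had changed following Yihui's suggestion, to the script file when I reopened it. But that is just a guess.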

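Since you are using UTF-8, which is not the native encoding of your operating system, you need to tell knitr the encoding of the input document explicitly. For example, you have to call

knit2pdf(brew_out, encoding = "UTF-8")

But I am not sure whether brew can handle non-native character encodings. If it cannot, I recommend that you use the system default encoding (which should be ISO8859-1 in this case), together with

\usepackage[latin9]{inputenc}

If you have to use UTF-8, you can also do everything in knitr (which also lets you compile the document with the click of a button); see the knit_expand example (075).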
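To make the mismatch concrete, here is a minimal base R sketch (an illustration, not part of the original scripts) showing that the same species name is stored as different bytes in UTF-8 and in Latin-1. The string `x` is a stand-in for one of the modified species names, and the escape `\u00e5` is used for å so that the snippet does not depend on the encoding of the script file itself.

```r
# "\u00e5" is the letter å; "x" is an illustrative stand-in for a species name.
x <- enc2utf8("\u00e5setosa")

# The same text as raw bytes in the two encodings.
utf8_bytes   <- charToRaw(x)
latin1_bytes <- charToRaw(iconv(x, from = "UTF-8", to = "latin1"))

print(utf8_bytes[1:2])   # c3 a5 (å takes two bytes in UTF-8)
print(latin1_bytes[1])   # e5    (å takes one byte in Latin-1)

# Latin-1 bytes are not a valid UTF-8 sequence, so a file written in the
# native encoding cannot be read as UTF-8 without conversion.
latin1_as_is <- rawToChar(latin1_bytes)
print(validUTF8(latin1_as_is))  # FALSE
```

The inputenc option in the template therefore has to match the encoding in which the brewed .rnw file is actually written, whichever of the two that turns out to be.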

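Thank you very much for your quick answer. I tried knit2pdf(brew_out, encoding = "UTF-8") in the R script, with either \usepackage[utf8]{inputenc} or \usepackage[T1]{fontenc} in the template. The rnw-file part of my comment got lost. I will try again:

@Yihui: Thanks for your quick answer. I tried (1) knit2pdf(brew_out, encoding = "UTF-8") in the R script, with either \usepackage[utf8]{inputenc} or \usepackage[T1]{fontenc} in the template .rnw file, and (2) knit2pdf(brew_out, ...) with \usepackage[latin9]{inputenc} in the .rnw file, but the problem remains. So it might be a brew issue. Thanks for pointing me to the knit_expand example.

@Yihui A follow-up question on your 075 knit_expand suggestion. I just want to name the subsections after the loop variable. In the template file, the subsection starts with "Now i is". How do I translate the brew version (\subsection{}) to knitr? My naive attempts were not successful. Thanks in advance.

@Henrik use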

\subsection{{{{i}}}

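Thanks a lot. I realize that there were spaces in the brew version as well, but I thought their only purpose was to make the code more readable, not that they were actually needed for the code to run. Obviously I lack some basic skills (LaTeX? knitr? both?). Thanks for your help, Yihui!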
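As a rough illustration of the point about spaces, the sketch below expands a one-line template with knit_expand() and its default {{ }} delimiters. The `species` vector and the `template` string are made up for this example and are not taken from the 075 example file; the spaces keep the LaTeX braces apart from the delimiters.

```r
library(knitr)

# Illustrative inputs, not from the original template.
species  <- c("setosa", "versicolor", "virginica")
template <- "\\subsection{ {{i}} }"

# Expand the template once per species; "i" is supplied as a named argument
# and is evaluated wherever {{i}} appears in the text.
sections <- unlist(lapply(species, function(i) {
  knit_expand(text = template, i = i)
}))

cat(sections, sep = "\n")
```

Each element of `sections` is an ordinary character string, so the expanded lines can be written to a file or passed on to knit() like any other .rnw source.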

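# R code to prepare the data, this time with umlauts in the grouping variable
library(plyr)    # dlply() and .()
library(xtable)  # xtable()

data(iris)
iris$Species <- as.character(iris$Species)

# prepend a special character to each species name
iris$Species[iris$Species == "setosa"] <- "åsetosa"
iris$Species[iris$Species == "versicolor"] <- "æversicolor"
iris$Species[iris$Species == "virginica"] <- "øvirginica"

# create a summary table for each species
iris_tbl <- dlply(.data = iris, .variables = .(Species), function(x) xtable(summary(x)))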

> sessionInfo()

R version 3.0.0 (2013-04-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252  LC_CTYPE=Norwegian (Bokmål)_Norway.1252   
[3] LC_MONETARY=Norwegian (Bokmål)_Norway.1252 LC_NUMERIC=C                              
[5] LC_TIME=Norwegian (Bokmål)_Norway.1252    

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Hmisc_3.10-1               survival_2.37-4            pastecs_1.3-13             boot_1.3-9                
 [5] pspline_1.0-15             ggplot2_0.9.3.1            lubridate_1.2.0            stringr_0.6.2             
 [9] brew_1.0-6                 knitr_1.1                  xtable_1.7-1               plyr_1.8                  
[13] PerformanceAnalytics_1.1.0 xts_0.9-3                  zoo_1.7-9                  gdata_2.12.0.2            

loaded via a namespace (and not attached):
 [1] cluster_1.14.4     colorspace_1.2-2   dichromat_2.0-0    digest_0.6.3       evaluate_0.4.3     formatR_0.7       
 [7] grid_3.0.0         gtable_0.1.2       gtools_2.7.1       labeling_0.1       lattice_0.20-15    MASS_7.3-26       
[13] memoise_0.1        munsell_0.4        proto_0.3-10       RColorBrewer_1.0-5 reshape2_1.2.2     scales_0.2.3      
[19] tools_3.0.0

> getOption("encoding")

[1] "native.enc"
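For anyone comparing setups, the calls below are a small base R sketch of how the native encoding of a session can be inspected. The values differ between machines, so no output is shown here.

```r
# Character-type locale of the session, e.g. a Windows code page such as 1252.
ctype <- Sys.getlocale("LC_CTYPE")

# Flags reporting whether the session runs in a UTF-8 or a Latin-1 locale.
info <- l10n_info()

# Default encoding used for file connections ("native.enc" unless changed).
default_enc <- getOption("encoding")

print(ctype)
print(info[c("UTF-8", "Latin-1")])
print(default_enc)
```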
