Special characters in R tm package/qdap output
I am trying to create a term-document matrix in R using the following dataset:
EmailSubject
Buy the stunning new phone
The game changer is here.
Experience a phone ahead of its time.
Thank You Chennai
Limited Period offer
Valentines day special
Buy a phone at 10000 and get a new sim free
Limited Period offer
Valentines day special
Buy a phone at 10000 and get a new sim free
Buy the stunning new phone
The game changer is here.
Experience a phone ahead of its time.
Thank You Chennai
Limited Period offer
Valentines day special
Buy a phone at 10000 and get a new sim free
Thank You Chennai
Limited Period offer
Valentines day special
Buy a phone at 10000 and get a new sim free
Buy a phone at 10000 and get a new sim free
Buy the stunning new phone
The game changer is here.
Experience a phone ahead of its time.
Thank You Chennai
Limited Period offer
I used qdap and its freq_terms function. The expected output is shown below:
freq_terms(DF)
Expected Output Frequency
Buy 4
Get 5
a 7
thank 12
Stunning 6
The 7
New 10
Valentines 4
phone 7
The following special characters appear frequently in the output and corrupt the term counts:
valentinea€™s and a€™s appear instead of valentines and as. I have tried the same with the tm package as well.
I tried replacing these characters with gsub, but that did not work very well. Can anyone suggest an approach?

This sounds like an encoding problem.
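If it is an encoding problem, one possible way to handle it is sketched below in base R. This is only an illustration under stated assumptions, not a confirmed diagnosis. It assumes the subject lines originally contained a typographic apostrophe (U+2019) stored as UTF-8, and that the file was then decoded as Windows-1252, which turns that one character into three. The sample strings and the helper names repair_mojibake and to_ascii are hypothetical. Only iconv, Encoding, enc2utf8, validUTF8 and gsub are real base R functions.

```r
# Hypothetical sample: a subject line with a typographic apostrophe (U+2019).
# \u escapes keep this script pure ASCII, so it does not depend on how the
# script file itself is encoded.
clean <- "Valentine\u2019s day special"

# Assumed failure mode: the UTF-8 bytes of U+2019 (E2 80 99) were decoded as
# Windows-1252, giving the three characters U+00E2, U+20AC and U+2122.
garbled <- "Valentine\u00e2\u20ac\u2122s day special"

# Repair attempt: turn the garbled text back into Windows-1252 bytes, then
# declare those bytes to be UTF-8. Strings that do not survive the round trip
# (NA or invalid UTF-8) are returned unchanged.
repair_mojibake <- function(x) {
  x <- enc2utf8(x)
  fixed <- iconv(x, from = "UTF-8", to = "CP1252")
  Encoding(fixed) <- "UTF-8"
  ok <- !is.na(fixed) & validUTF8(fixed)
  x[ok] <- fixed[ok]
  x
}

# Normalisation: map the curly apostrophe to a plain ASCII one, then drop any
# remaining non-ASCII characters before the text reaches tm or qdap.
to_ascii <- function(x) {
  x <- gsub("\u2019", "'", enc2utf8(x), fixed = TRUE)
  iconv(x, from = "UTF-8", to = "ASCII", sub = "")
}

repaired <- repair_mojibake(garbled)
subject_ascii <- to_ascii(repaired)
print(subject_ascii)
```

Repairing after the fact is a fallback. If the data comes from a file, declaring the encoding when reading it (for example through the fileEncoding argument of read.csv) may avoid the problem altogether. The repair only works while all three garbled characters are intact; if one of them has already been altered or dropped, the round trip fails and the string is returned as it was.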