JavaScript: Deobfuscation in Python using a converted JS function

I need to convert the following function to Python so that I can deobfuscate text extracted while web scraping:

function obfuscateText(coded, key) {
// Email obfuscator script 2.1 by Tim Williams, University of Arizona
// Random encryption key feature by Andrew Moulden, Site Engineering Ltd
// This code is freeware provided these four comment lines remain intact
// A wizard to generate this code is at http://www.jottings.com/obfuscator/
shift = coded.length
link = ""
for (i = 0; i < coded.length; i++) {
    if (key.indexOf(coded.charAt(i)) == -1) {
        ltr = coded.charAt(i)
        link += (ltr)
    }
    else {
        ltr = (key.indexOf(coded.charAt(i)) - shift + key.length) % key.length
        link += (key.charAt(ltr))
    }
}
document.write("<a href='mailto:" + link + "'>" + link + "</a>")
}

print(obfuscateText("uw#287u#Guw#287Xw8Iwu!#W7L#", "wxyvzabuckdtefgshirjklqmnoppqorstnuvmwxylz01k23j456i789h@G!#$F%&E'*+D-/=C?^B `{A}}"))

actionattraction$comcastWnet

But the output above is slightly wrong: I expected actionattraction@comcast.net. Also, the code above often produces random characters for the same HTML page.

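The target HTML page contains a JS call to the obfuscateText function with coded and key arguments. I extract that call into obsfunc and execute it dynamically: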

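email = eval(obsfunc)

This stores the email in the variable above. It works most of the time but fails some of the time, and I strongly suspect the problem lies in the arguments passed to the Python function: perhaps they need to be escaped or converted because they contain special characters? I tried passing raw arguments and different forms such as repr(), but the problem persists.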
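One way to rule out escaping problems is to avoid eval altogether and pull the two string arguments out of the page text directly. The following is only a sketch under stated assumptions: the call is assumed to look like obfuscateText("coded", "key") with double-quoted arguments, and the names CALL_RE and extract_args are hypothetical, not part of the original code.

```python
import re

# Assumed call shape: obfuscateText("coded", "key") with double-quoted
# arguments that contain no embedded double quotes. JavaScript backslash
# escapes, if any, are left undecoded.
CALL_RE = re.compile(r'obfuscateText\(\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\)')


def extract_args(page_source):
    """Return (coded, key) from the first matching call, or None if absent."""
    match = CALL_RE.search(page_source)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Made-up sample; the backslash stays a literal character instead of being
# interpreted as a Python escape sequence, which is what eval would do.
sample = r'<script>obfuscateText("ab\tc", "xyz$%&abc")</script>'
print(extract_args(sample))
```

Because the arguments are never parsed as Python string literals, characters such as backslashes, quotes of the other kind, and dollar signs reach the decoding function unchanged.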

Some examples where the same Python function decodes actionattraction@comcast.net incorrectly and correctly (the first line is the email):


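First, index() does not return None when the character is missing; it raises an exception (ValueError). In your case, W appears instead of the dot because the returned index is 0, and the not inkey test (which is also wrong) mistakenly concludes that the character is not in the key.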
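The index-0 pitfall described above can be reproduced in isolation. The asker's Python port is not shown, so the buggy helper below is only a guess at the pattern; the key and both function names are made up for illustration.

```python
key = "Wabc"  # made-up key with "W" at index 0


def is_in_key_buggy(key, ch):
    """Guessed buggy pattern: relies on the truthiness of the index."""
    return bool(key.find(ch))  # 0 is falsy, -1 is truthy


def is_in_key(key, ch):
    """Compare against -1, as the JavaScript indexOf check does."""
    return key.find(ch) != -1


print(is_in_key_buggy(key, "W"))  # False, although "W" is in the key
print(is_in_key(key, "W"))        # True

# index() signals a missing character with ValueError, never with None.
try:
    key.index("?")
except ValueError:
    print("'?' is not in the key")
```

A character that sits at the start of the key is therefore passed through untranslated by the buggy check, which matches the symptom of a stray W where the dot should be.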

Second, the presence of &amp; suggests that you really do need to find and decode HTML entities.

Finally, I would suggest rewriting it like this:

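def deobfuscate(code, key):  # function name added so that the return is valid
    len0 = len(code)
    len1 = len(key)
    link = ''
    for ch in code:
        try:
            # Shift the character back by the length of the coded text.
            ch = key[(key.index(ch) - len0 + len1) % len1]
        except ValueError:
            pass  # character is not in the key: keep it unchanged
        link += ch
    return link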

I have rewritten the deobfuscator:

def deobfuscate_text(coded, key):
    offset = (len(key) - len(coded)) % len(key)
    shifted_key = key[offset:] + key[:offset]
    lookup = dict(zip(key, shifted_key))
    return "".join(lookup.get(ch, ch) for ch in coded)
and tested it with

tests = [
    ("KMd%Y@Kdd8KMd%Y@IMY!MKcdJ@*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y@X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~"),
    ("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!@.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~"),
    ("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!@.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~"),
    ("ZUhq4uh@e4Om.04O", "ksYSozqUyFOx9uKvQa2P4lEBhMRGC8g6jZXiDwV5eJcAp7rIHL31bnTWmN0dft")
]

for coded,key in tests:
    print(deobfuscate_text(coded, key))

which prints

actionattraction@comcast.net
actionattraction@comcast.net
actionattraction@comcast.net
anybody@home.com
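A round-trip check is another way to gain confidence in the rotation logic. The sketch below repeats deobfuscate_text and pairs it with a hypothetical inverse, obfuscate_text, which is not part of the original script; the key is made up and is assumed to contain no repeated characters.

```python
def deobfuscate_text(coded, key):
    # Rotate the key so each character maps len(coded) positions back.
    offset = (len(key) - len(coded)) % len(key)
    shifted_key = key[offset:] + key[:offset]
    lookup = dict(zip(key, shifted_key))
    return "".join(lookup.get(ch, ch) for ch in coded)


def obfuscate_text(plain, key):
    # Hypothetical inverse: map each character len(plain) positions forward.
    offset = len(plain) % len(key)
    shifted_key = key[offset:] + key[:offset]
    lookup = dict(zip(key, shifted_key))
    return "".join(lookup.get(ch, ch) for ch in plain)


key = "abcdefghijklmnopqrstuvwxyz@."  # made-up key, every character unique
plain = "anybody@home.com"

coded = obfuscate_text(plain, key)
print(coded != plain)                         # True
print(deobfuscate_text(coded, key) == plain)  # True
```

Characters that are not in the key, such as a hyphen here, pass through unchanged in both directions, so the round trip still holds for them.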
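Note that all three of your key strings contain &amp;; replacing it with & fixes the problem. Presumably the JavaScript was mistakenly HTML-escaped at some point. Python has a module that will unescape HTML special characters, like so: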

# Python 2.x:
import HTMLParser
html_parser = HTMLParser.HTMLParser()
unescaped = html_parser.unescape(my_string)

# Python 3.4+ (HTMLParser.unescape was removed in Python 3.9):
import html
unescaped = html.unescape(my_string)