How do I create an embedded Python module for remote sandboxed execution?


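I am trying to add Python code dynamically to a sandbox module so that it can be executed on a remote machine. I have run into a problem with how to handle imported functions. For example, it is common to see scripts written like this: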

 from test_module import g
 import other_module

 def f():
     g()
     other_module.z()

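I know I can pickle f together with g and potentially z, but how do I preserve the "other_module" scope for z? If I put both f and g into the sandbox, z will not resolve correctly when f is called. Is it possible to use some kind of embedded module so that z resolves correctly, i.e. sandbox.other_module?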
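
The idea of an embedded `sandbox.other_module` can be sketched as follows. This is a minimal illustration in Python 3, not the asker's code: the module contents are hypothetical stand-ins built from source strings, and `types.ModuleType` is used instead of the Python 2 `imp.new_module`. Neither module is registered in `sys.modules`, so the global import state is left untouched.

```python
import types

# Hypothetical stand-in for the real other_module from the question.
other_module = types.ModuleType("other_module")
exec("def z():\n    return 'z from other_module'", other_module.__dict__)

sandbox = types.ModuleType("sandbox")
# Bind the dependency as an attribute of the sandbox, so that code whose
# globals are sandbox.__dict__ can resolve the name `other_module`.
sandbox.other_module = other_module

# Hypothetical source for f and g; both are defined inside the sandbox.
source = """
def g():
    return 'g'

def f():
    return g(), other_module.z()
"""
exec(source, sandbox.__dict__)

result = sandbox.f()
print(result)
```

Because `f` is created with `sandbox.__dict__` as its globals, both `g` and `other_module.z` resolve through the sandbox namespace only.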

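My purpose in loading the remote code into a sandbox is to avoid polluting the global namespace. For example, if another remote method is invoked with its own dependency graph, it should not interfere with another set of remote code. Is it realistic to expect Python to remain stable as sandbox modules come into and go out of use? I ask because of this post:

which makes me think that removing modules, such as the different sandboxes in this case, could be problematic.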

Other modules can be imported into the sandbox with

or:

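If you call the "sandbox" module from other modules or from other sandbox modules and want to reload some new code later, it is easier to import only the module itself, rather than importing names from it as in "from sandbox import f", and to call "sandbox.f" instead of "f". Reloading is then easy. (The built-in reload command is of no use here, though.)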
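
The difference between holding the module and holding a name imported from it can be seen in a small Python 3 sketch. The "reload" here is simulated by executing new source in the same module namespace, which is an assumption about how new code would arrive; the function bodies are hypothetical.

```python
import types

sandbox = types.ModuleType("sandbox")
exec("def f():\n    return 'old'", sandbox.__dict__)

# Equivalent of `from sandbox import f`: a second name bound to the same object.
f = sandbox.f

# Simulated reload: run new source in the same module namespace.
exec("def f():\n    return 'new'", sandbox.__dict__)

stale = f()          # the local name still points at the old function object
fresh = sandbox.f()  # attribute lookup on the module sees the replacement
print(stale, fresh)
```

Every name created by a "from ... import" would have to be rebound by hand after new code is loaded, whereas callers that go through `sandbox.f` pick up the new function automatically.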

Classes

>>> class A(object): pass
... 
>>> a = A()
>>> A.f = lambda self, x: 2 * x  # or a pickled function
>>> a.f(1)
2
>>> A.f = lambda self, x: 3 * x
>>> a.f(1)
3

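Reloading methods appears to be easy. As I recall, reloading classes defined in modified source code can be complicated, because the old class code may still be held by some instance. In the worst case, the class of individual instances can, or must, be updated one by one: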

    some_instance.__class__ = sandbox.SomeClass  # that means the same reloaded class

I use the latter with a Python service accessed through win32com automation; reloading the class code worked without losing instance data.
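
A short Python 3 sketch of the `__class__` reassignment, assuming the reloaded class keeps a compatible instance layout. The class body is hypothetical, and the reload is simulated by defining the class a second time.

```python
class SomeClass:
    def __init__(self, value):
        self.value = value

    def describe(self):
        return "v1:%s" % self.value


some_instance = SomeClass(42)

# Simulated reload: the name SomeClass is rebound to a new class object.
class SomeClass:
    def __init__(self, value):
        self.value = value

    def describe(self):
        return "v2:%s" % self.value


before = some_instance.describe()    # the instance still uses the old class
some_instance.__class__ = SomeClass  # point the instance at the new class
after = some_instance.describe()     # new method code, same instance data
print(before, after)
```

The instance dictionary is untouched by the reassignment, which is why the stored `value` survives.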

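The approach I currently use supports binding both "import x" and "from x import y" dependencies. One drawback of the current implementation is that it creates a copy of each method in every module that uses it, unlike the original source, where every use is just a reference to the same method in memory (although I have conflicting results here; see the section after the code).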

///analysis_script.py/// (dependencies excluded for brevity)

///driver.py///

import modutil
import analysis_script

modutil.serialize_module_with_dependencies(analysis_script)

///modutil.py///

import sys
import modulefinder
import os
import inspect
import marshal

def dump_module(funcfile, name, module):
    functions_list = [o for o in inspect.getmembers(module) if inspect.isfunction(o[1])]
    print 'module name:' + name
    marshal.dump(name, funcfile)
    for func in functions_list:
       print func
       marshal.dump(func[1].func_code, funcfile)

def serialize_module_with_dependencies(module):

    python_path = os.environ['PYTHONPATH'].split(os.pathsep)
    module_path = os.path.dirname(module.__file__)

    #planning to search for modules only on this python path and under the current scripts working directory
    #standard libraries should be expected to be installed on the target platform
    #python_path is already a list, so append to it rather than nesting it
    search_dir = python_path + [module_path]

    mf = modulefinder.ModuleFinder(search_dir)

    #__file__ returns the .pyc file after the first run;
    #mf.run_script needs the .py source, so map the extension back
    src_file = module.__file__
    if src_file.endswith('.pyc'):
        src_file = src_file[:-1]

    mf.run_script(src_file)

    funcfile = open("functions.pickle", "wb")

    dump_module(funcfile, 'sandbox', module)

    for name, mod in mf.modules.iteritems():
        #the sys module is included by default but has no file and we don't want it anyway, i.e. should
        #be on the remote systems path. __main__ we also don't want since it should be virtual empty and
        #just used to invoke this function.
        if not name == 'sys' and not name == '__main__':
            dump_module(funcfile, name, sys.modules[name])

    funcfile.close()

///sandbox_reader.py///

import marshal
import types
import imp

sandbox_module = imp.new_module('sandbox')

dynamic_modules = {}
current_module = ''
with open("functions.pickle", "rb") as funcfile:
    while True:
        try:
            code = marshal.load(funcfile)
        except EOFError:
            break

        if isinstance(code,types.StringType):
            print "module name:" + code
            if code == 'sandbox':
                current_module = "sandbox"
            else:
                current_module = imp.new_module(code)
                dynamic_modules[code] = current_module
                exec 'import '+code in sandbox_module.__dict__
        elif isinstance(code,types.CodeType):
            print "func"
            if current_module == "sandbox":
                func = types.FunctionType(code, sandbox_module.__dict__, code.co_name)
                setattr(sandbox_module, code.co_name, func)
            else:
                func = types.FunctionType(code, current_module.__dict__, code.co_name)
                setattr(current_module, code.co_name, func)
        else:
            raise Exception( "unknown type received")

#yaa! actually invoke the method
sandbox_module.f()
del sandbox_module
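
For readers on Python 3, where `imp`, `func_code` and the `exec ... in` statement are no longer available, the same round trip can be sketched as below. This is not a port of the code above: it uses one `marshal.dumps` call on a list of records instead of a stream, and the functions `f` and `g` are hypothetical. Note that the marshal format is specific to the Python version, so sender and receiver need matching interpreters.

```python
import builtins
import marshal
import types


def f():
    return g() + 1


def g():
    return 41


# Sender: one (module_name, [code objects]) record per module.
payload = marshal.dumps([("sandbox", [f.__code__, g.__code__])])

# Receiver: rebuild the functions inside fresh module objects that are
# never registered in sys.modules.
modules = {}
for module_name, codes in marshal.loads(payload):
    module = modules.setdefault(module_name, types.ModuleType(module_name))
    # Give the namespace explicit access to the builtins.
    module.__dict__["__builtins__"] = builtins
    for code in codes:
        func = types.FunctionType(code, module.__dict__, code.co_name)
        setattr(module, code.co_name, func)

answer = modules["sandbox"].f()
print(answer)
```

The rebuilt `f` finds `g` because both functions share the rebuilt module's dictionary as their globals.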

For example, the function graph looks like this before serialization:

 module name:sandbox
 ('f', <function f at 0x15e07d0>)
 ('z', <function z at 0x7f47d719ade8>)
 module name:test_module
 ('g', <function g at 0x15e0758>)
 ('z', <function z at 0x7f47d719ade8>)
 module name:third_level_module
 ('z', <function z at 0x7f47d719ade8>)

Looking specifically at the function z, we can see that all references point to the same address, i.e. 0x7f47d719ade8.

In the remote process, after the sandbox has been rebuilt, we have:

 print sandbox_module.z 
 <function z at 0x1a071b8>
 print sandbox_module.third_level_module.z 
 <function z at 0x1a072a8>
 print sandbox_module.test_module.z 
 <function z at 0x1a072a8>

This surprised me! I expected all the addresses here to be unique after reconstruction, but for some reason sandbox_module.test_module.z and sandbox_module.third_level_module.z have the same address?
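
One possible reading of the identical addresses, offered as an inference from the reader code rather than something the thread states: the `exec 'import ...'` line performs an ordinary import, so the name bound in the sandbox is whatever the import system returns, not the module object created with `imp.new_module`. If `test_module` and `third_level_module` are importable on the remote machine, `test_module.z` would then be the very same object as `third_level_module.z`. The Python 3 sketch below shows only the binding behaviour; `demo_mod` is a hypothetical module name.

```python
import sys
import types

# Stand-in for a module that is really importable on the remote machine.
real = types.ModuleType("demo_mod")
real.z = lambda: "real"
sys.modules["demo_mod"] = real

# The module object that the reader builds by hand under the same name.
rebuilt = types.ModuleType("demo_mod")
rebuilt.z = lambda: "rebuilt"

sandbox = types.ModuleType("sandbox")
exec("import demo_mod", sandbox.__dict__)

# The import statement binds the module found by the import system.
bound_to_real = sandbox.demo_mod is real
bound_to_rebuilt = sandbox.demo_mod is rebuilt
print(bound_to_real, bound_to_rebuilt)

del sys.modules["demo_mod"]  # clean up the stand-in
```

Under this reading, binding the hand-built module directly, for example with `setattr(sandbox, name, rebuilt)`, would be needed for the sandbox to see the rebuilt functions.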

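You probably do not want to serialize functions imported from the Python library, such as math functions, or from large packages that mix Python and C, but your code serializes them. That can cause unnecessary problems, for example because they have no func_code attribute.

You do not need to serialize repeatedly a function that has already been serialized. You can send its full name and import it from that information. That is why you get it in memory multiple times.

The original format
```
…
```
is not general enough. A function can be imported on the local machine under a different name with a "... import ... as ..." clause. You can serialize a list of tuples that mix strings and code objects.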
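
A list of tuples that mixes strings and code objects could look like the following Python 3 sketch. The record layout `(local_name, payload)` and the use of `importlib.import_module` to resolve dotted names are assumptions made for illustration; `local_helper` is a hypothetical function.

```python
import importlib
import marshal
import types


def local_helper(x):
    return x * 2


# Each record is (name inside the module, payload). The payload is either a
# code object to rebuild or a dotted "module.attr" string to import by name.
records = [
    ("double", local_helper.__code__),  # own code, stored under a new name
    ("root", "math.sqrt"),              # library function, sent by name only
]
blob = marshal.dumps(records)

sandbox = types.ModuleType("sandbox")
for name, payload in marshal.loads(blob):
    if isinstance(payload, types.CodeType):
        obj = types.FunctionType(payload, sandbox.__dict__, name)
    else:
        module_name, _, attr = payload.rpartition(".")
        obj = getattr(importlib.import_module(module_name), attr)
    setattr(sandbox, name, obj)

value = sandbox.root(sandbox.double(8))
print(value)
```

Storing the local name separately from the payload is what allows a function imported "as" another name to be rebuilt under the name the calling code expects.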

Hint (see the `some_filter` and `dump_module` sketch at the end):

Now there is no important difference between the sandbox and the other serialized modules. The "if sandbox" condition can soon be removed.

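That is a question. I think it does not create more copies, but rather more references to the function. However, the original function is not removed from memory until all references to it have been deleted or replaced. That is why I prefer calling "some_mod.f" over calling "f". If I import names from an ordinary module, I also have to update the name references to the reloaded objects individually. Have you found any difference in memory usage between the normal case and the "sandbox"? By the way, was my initial answer about pickling useful to you, or are you not sure? (button)

Both of your answers were useful! +1 for both, and the check mark for this thread. I added details about the confusion over references to the rebuilt functions.

The method posted here dumps all imports into the sandbox module. That is not good. The right way is to reconstruct the imports for each module as they existed on the sending side.

This works well. It is exactly what I needed to do! Thanks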

import sys

def some_filter(module_name):
    # built-in modules have no __file__; treat them as library code
    mod_path = getattr(sys.modules[module_name], '__file__', None)
    # or if getattr(sys.modules[module_name], 'some_my_attr', None)
    return mod_path is not None and not mod_path.startswith('/usr/lib/python2.7/')

dumped_funcs = {}

def dump_module(...
    ...
    data = []
    for func_name, func_obj in functions_list:
        if some_filter(func_obj.__module__) and func_obj not in dumped_funcs and \
                    hasattr(func_obj, 'func_code'):
            # first time this function is seen: send its code object
            data.append((func_name, func_obj.func_code))
            dumped_funcs[func_obj] = True  # maybe later will be saved package.mod.fname
        else:
            # library code or already sent: send only the dotted name
            # (__name__ also works for objects that have no func_code)
            data.append((func_name, '%s.%s' % (func_obj.__module__, \
                                               func_obj.__name__)))
    marshal.dump(data, funcfile)