Python: replace NaN with draws from a moving-window normal distribution

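I need to replace the NaNs in a 1-D array with values drawn from a local normal distribution, using numpy. I pick a window, compute the mean and standard deviation over that window, and then replace the NaNs with draws from that normal distribution, while the rest of the signal stays unchanged.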

import numpy as np

def replace_nan(signal, window = 5):
    """
    calculate moving average and std of signal window without nan    
    replaces nan values with normal distribution (mean, std)    
    """
    # add padding in case signal starts/ends with nan
    signal = np.pad(signal, (window, window), 'mean', stat_length = 2*window)    

    for k in range(window,len(signal)-window):        
        mean = np.nanmean(signal[k-window:k+window])  # window average 
        std = np.nanstd(signal[k-window:k+window]) # window std without nan 

        ind = np.where(np.isnan(signal[k-window:k+window]))[0]    
        print (ind)   
        signal[ind]= np.random.normal(mean, std)

    signal = signal[window:len(signal)-window] #remove padding

    return signal

#tester 
signal = np.array([0.71034849, 0.17730998, 0.77577915, 0.38308111, 
0.24278947, np.nan, np.nan, 0.68694097, 0.6684736 , 0.47310845, 0.22210945, 
0.1189111, np.nan, np.nan, np.nan, 0.5573841 , 0.57531205, 0.74131346, 
0.29088101, 0.5573841 , 0.57531205, 0.74131346, np.nan, np.nan, np.nan, 
np.nan, 0.49534304, 0.18370482, 0.06089498, 0.22210945, 0.1189111])        

signal = replace_nan(signal, 5)

print(signal)

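I replace the NaNs using np.random.normal, with the mean and standard deviation computed over a moving window of size 5. Something goes wrong when I select the NaNs inside the signal window in order to replace them. This should be easy; I am just a Python beginner.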

I haven't checked whether the numbers are accurate, but I think this will work:

import numpy as np

def replace_nan(signal, window = 5):
    """
    calculate moving average and std of signal window without nan
    replaces nan values with normal distribution (mean, std)
    """

    # add padding in case signal starts/ends with nan
    signal = np.pad(signal, (window, window), 'mean', stat_length = 2*window)

    for k in range(window, len(signal) - window + 1):
        mean = np.nanmean(signal[k-window:k+window])  # window average
        std = np.nanstd(signal[k-window:k+window]) # window std without nan

        if np.isnan(signal[k]):
            signal[k] = np.random.normal(mean, std)

    signal = signal[window:len(signal)-window] #remove padding

    return signal

#tester
signal = np.array(
    [
        0.71034849, 0.17730998, 0.77577915, 0.38308111, 0.24278947, np.nan,
        np.nan, 0.68694097, 0.6684736 , 0.47310845, 0.22210945, 0.1189111,
        np.nan, np.nan, np.nan, 0.5573841 , 0.57531205, 0.74131346,
        0.29088101, 0.5573841 , 0.57531205, 0.74131346, np.nan, np.nan,
        np.nan, np.nan, 0.49534304, 0.18370482, 0.06089498, 0.22210945,
        0.1189111
    ]
)

print("Before:")
print(signal)

signal = replace_nan(signal, 5)

print("\nAfter:")
print(signal)

This gives:

Before:
[ 0.71034849  0.17730998  0.77577915  0.38308111  0.24278947         nan
         nan  0.68694097  0.6684736   0.47310845  0.22210945  0.1189111
         nan         nan         nan  0.5573841   0.57531205  0.74131346
  0.29088101  0.5573841   0.57531205  0.74131346         nan         nan
         nan         nan  0.49534304  0.18370482  0.06089498  0.22210945
  0.1189111 ]

After:
[ 0.71034849  0.17730998  0.77577915  0.38308111  0.24278947  0.35960417
  0.508657    0.68694097  0.6684736   0.47310845  0.22210945  0.1189111
  0.50282732  0.34906067  0.31206557  0.5573841   0.57531205  0.74131346
  0.29088101  0.5573841   0.57531205  0.74131346  0.80133879  0.63122315
  0.49236281  0.35630875  0.49534304  0.18370482  0.06089498  0.22210945
  0.1189111 ]
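Because the fill values are random, the "After" array will differ on every run. The following is a minimal sketch of a repeatable variant, assuming the same padding and window logic as the code above; the function name replace_nan_seeded and the seed parameter are hypothetical additions, and np.random.default_rng is used in place of np.random.normal purely to make the draws reproducible.

```python
import numpy as np


def replace_nan_seeded(signal, window=5, seed=0):
    """Windowed NaN fill as above, but with a seeded random generator."""
    rng = np.random.default_rng(seed)  # `seed` is an assumption, for repeatability
    padded = np.pad(signal, (window, window), "mean", stat_length=2 * window)

    for k in range(window, len(padded) - window):
        if np.isnan(padded[k]):
            chunk = padded[k - window:k + window]
            # Earlier fills are already stored in `padded`,
            # so they contribute to the statistics of later windows.
            padded[k] = rng.normal(np.nanmean(chunk), np.nanstd(chunk))

    return padded[window:len(padded) - window]  # remove padding


signal = np.array([
    0.71034849, 0.17730998, 0.77577915, 0.38308111, 0.24278947, np.nan,
    np.nan, 0.68694097, 0.6684736, 0.47310845, 0.22210945, 0.1189111,
    np.nan, np.nan, np.nan, 0.5573841, 0.57531205, 0.74131346,
    0.29088101, 0.5573841, 0.57531205, 0.74131346, np.nan, np.nan,
    np.nan, np.nan, 0.49534304, 0.18370482, 0.06089498, 0.22210945,
    0.1189111,
])

filled = replace_nan_seeded(signal, window=5, seed=0)
known = ~np.isnan(signal)

# Sanity checks: no NaN left, and the known samples are untouched
print(np.isnan(filled).any())
print(np.array_equal(filled[known], signal[known]))
```

The two checks at the end are a quick way to confirm that only the NaN positions were changed, which is the requirement stated in the question.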

Comments:

You can improve your question by giving a concrete example.

Hmm, I need to think about how to do that. I work with large signals and plot them to look at specific examples. But it seems I can't upload signals and figures here, or maybe I can and just haven't found where? For example, I may have missed the place for including data or figures. signal = np.array([0.71034849, 0.17730998, 0.77577915, 0.38308111, 0.24278947, np.nan, 0.68694097, 0.6684736, 0.47310845, 0.22210945, 0.1189111, np.nan, 0.5573841, 0.57531205, 0.74131346, 0.29088101, np.nan, 0.49534304, 0.18370482, 0.06089498])?

The body field of the question has a ribbon with several small icons. You can hover over them to see what they do. Inserting a figure helps, but it is not enough. It is best to also include a small data sample, show in the figure what your code is doing with it, and then show the desired output.

I tried editing; maybe it is better now.

Perfect. I knew it should be easy! if np.isnan(signal[k]): signal[k] = np.random.normal(mean, std) is better, thanks. Slightly off-topic: I find my syntax habits are not great. Is it OK to post working code on StackOverflow and ask for suggestions on improving the syntax?

Np! :) Syntax questions? I'm not sure that is the best kind of question for StackOverflow. But I can give you some advice now. Python programmers generally follow this guideline [link lost]. If you use [link lost], it will let you know as you type whether your syntax is off.

Thanks, I looked through pep8, and yes, I have been doing it wrong...

It works like a charm now. Better to build good habits from the start than to have to change them later.