Python: How many data points are plotted on my matplotlib graph?

python pandas matplotlib

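So, I want to count the number of data points plotted on my graph, to keep a running total of the graphed data. The problem is that my data table messes this up: one column has NaN values in some rows, and another column may or may not have a NaN value in those same rows. For example: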

# I use num1 as my y-coordinate and num1-num2 for my x-coordinate.
num1 num2 num3 
1    NaN  25 
NaN  7    45
3    8    63
NaN  NaN  23
5    10   42
NaN  4    44

# So in this case, there should be only 2 data points on the graph between num1 and num2. For num1 and num3, there should be 3. There should be 4 data points between num2 and num3.

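I believe Matplotlib doesn't plot the rows that contain a NaN value in either column, since the value is null (please correct me if I'm wrong; I can only tell because there are no dots at the 0 coordinate of the x and y axes). At first, I thought I could get away with using .count(), finding the smaller of the two columns and using that as my tracker. Realistically, that won't work, as shown in the example above: the true number can be even lower, because one column may have a NaN value in a row where the other has an actual value. Some code examples of what I did: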

# colA and colB are columns within the DataFrame; this is my attempt to "count" how many
# data points are being graphed.
def findAmountOfDataPoints(colA, colB):
    if colA.count() < colB.count():
        print(colA.count())  # colA has fewer non-NaN values, so print its count.
    else:
        print(colB.count())  # colB has fewer (or equally many) non-NaN values, so print its count.

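Additionally, I thought about using .value_counts(), but I'm not sure that is the function I'm looking for. Any suggestions?
Edit 1: Changed the DataFrame names to make the example clearer.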
If I understand your question correctly, and assuming your table is a DataFrame df, the following code should work:
sum((~np.isnan(df['num1']) & (~np.isnan(df['num2']))))

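How it works:
np.isnan returns True if the cell is NaN; ~np.isnan is the opposite, so it returns True when the cell is not NaN.
The code checks where column 'num1' and column 'num2' both contain non-NaN values. In other words, it returns True for the rows where both values are present.
Finally, sum counts the good rows, because only the True values contribute to the total.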
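The same check can also be written with pandas methods only. The following is a minimal sketch using the example data from the question; the helper name count_plottable is hypothetical and not part of either answer. It also shows that dropna with subset returns a new DataFrame, so the original rows stay available for other graphs.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "num1": [1, np.nan, 3, np.nan, 5, np.nan],
    "num2": [np.nan, 7, 8, np.nan, 10, 4],
    "num3": [25, 45, 63, 23, 42, 44],
})


def count_plottable(frame, col_a, col_b):
    """Count rows where both columns are non-NaN (hypothetical helper)."""
    return int(frame[[col_a, col_b]].notna().all(axis=1).sum())


print(count_plottable(df, "num1", "num2"))  # 2
print(count_plottable(df, "num1", "num3"))  # 3
print(count_plottable(df, "num2", "num3"))  # 4

# Alternative: dropna(subset=...) returns a filtered copy; df is not modified.
print(len(df.dropna(subset=["num1", "num2"])))  # 2
print(len(df))  # 6
```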
My understanding is that you need the number of combinations of non-NaN points. Using a function I found, I came up with the following:
import pandas as pd
import numpy as np

def choose(n, k):
    """
    A fast way to calculate binomial coefficients by Andrew Dalke (contrib).
    https://stackoverflow.com/questions/3025162/statistics-combinations-in-python
    """
    if 0 <= k <= n:
        ntok = 1
        ktok = 1
        for t in range(1, min(k, n - k) + 1):
            ntok *= n
            ktok *= t
            n -= 1
        return ntok // ktok
    else:
        return 0


data = {'num1': [1, np.nan,3,np.nan,5,np.nan],
        'num2': [np.nan,7,8,np.nan,10,4],
        'num3': [25,45,63,23,42,44]
        }

df = pd.DataFrame(data)

df['notnulls'] = df.notnull().sum(axis=1)

df['plotted'] = df.apply(lambda row: choose(int(row.notnulls), 2), axis=1)
print(df)
print("Total data points: ", df['plotted'].sum())

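Output:
   num1  num2  num3  notnulls  plotted
0   1.0   NaN    25         2        1
1   NaN   7.0    45         2        1
2   3.0   8.0    63         3        3
3   NaN   NaN    23         1        0
4   5.0  10.0    42         3        3
5   NaN   4.0    44         2        1
Total data points:  9

Comments:

@TrentonMcKinney I explained the problem in my example. I can only find the lower count of the two columns, but the other column may have a NaN value in that row, as in my example, and then the point won't be plotted. I will change my example to show this.

@TrentonMcKinney OK, I updated my example to explain how this doesn't work. See how num1 has 3 non-NaN values and num2 has 4 non-NaN values? Now, when these two are plotted against each other, only rows 3 and 5 will be plotted (or rows 2 and 4 if you prefer to start from 0). So there are only 2 data points on the graph, and df.count() cannot find that number.

@TrentonMcKinney Sorry, I left those in to show that I tried to solve the problem but didn't get the result I wanted. So, does df.dropna() completely drop the whole row from the DataFrame I get back? I still want to keep that row in my DataFrame, because I have a large x column that I want to keep for other graphs.

That makes sense in theory as well. So does np.isnan() go through every single cell each time?

In a way, yes. np.isnan() checks the input array (in this case, a column of the DataFrame) and returns a boolean array of the same shape, containing True for cells that are NaN in the input array and False for cells that are not. With ~np.isnan, it's the opposite.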
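The total of 9 from the second answer can be broken down per column pair, which is closer to what the question asks for. This is only a sketch under the assumption that every pair of columns is plotted against each other; the names present and pair_counts are made up for illustration. Multiplying the 0/1 presence matrix by its own transpose gives, for each pair of columns, the number of rows where both values are non-NaN.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "num1": [1, np.nan, 3, np.nan, 5, np.nan],
    "num2": [np.nan, 7, 8, np.nan, 10, 4],
    "num3": [25, 45, 63, 23, 42, 44],
})

# 1 where a value is present, 0 where it is NaN.
present = df.notna().astype(int)

# Entry (a, b) is the number of rows where both column a and column b are non-NaN.
# The diagonal holds each column's own non-NaN count.
pair_counts = present.T @ present
print(pair_counts)

# Summing the entries above the diagonal counts each distinct pair once.
total = int(np.triu(pair_counts.to_numpy(), k=1).sum())
print("Total data points:", total)
```

For the example data, the off-diagonal entries are 2 (num1 with num2), 3 (num1 with num3) and 4 (num2 with num3), which add up to the same total of 9.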