How can I create a frequency distribution table for given data in a Jupyter notebook using Python with as little code as possible?


python pandas statistics jupyter-notebook


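Summarize these data and build a frequency distribution. The data are the demand for an item over 20 days: 211021302403432430. The task is to create a table in a Jupyter notebook with the columns Demand and Frequency. Note: the demand must be in ascending order. Here is what I did: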

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2 ,3, 4, 2, 2, 2, 4, 3, 0] # created a list of the data
import pandas as pd
series_of_days = pd.Series(list_of_days) # converted the list to series
series_of_days.value_counts(ascending = True) # the frequency was ascending but not the demand
test = dict(series_of_days.value_counts())
freq_table =  pd.Series(test)
pd.DataFrame({"Demand":freq_table.index, "Frequency":freq_table.values})

The output must look like this:

<table border = "1">

  <tr>
    <td>Demand</td>
    <td>Frequency</td>
  </tr>
  <tr>
    <td>0</td>
    <td>4</td>
  </tr>
  <tr>
    <td>1</td>
    <td>2</td>
  </tr>
  <tr>
    <td>2</td>
    <td>7</td>
  </tr>
</table>



and so on. Is there a better way to shorten the Python code, or to make it more efficient?

import collections
collections.Counter(list_of_days)

This should do what you describe.
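Counter on its own returns the counts but not the sorted two-column table the question asks for. A minimal sketch of one way to finish the job, assuming the Demand and Frequency column names from the question (the names `counts` and `freq_table` are only illustrative):

```python
from collections import Counter

import pandas as pd

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

# Counter keeps keys in first-seen order, so sort the (demand, count) pairs explicitly.
counts = sorted(Counter(list_of_days).items())

# Build the two-column table from the sorted pairs.
freq_table = pd.DataFrame(counts, columns=["Demand", "Frequency"])
print(freq_table)
```

In a notebook, leaving `freq_table` as the last expression in a cell renders it as a table instead of printing it.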

You can count and sort as follows:
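The code for this answer did not survive on the page. A minimal sketch, assuming the answer meant `Series.value_counts` followed by `sort_index` (that pairing is an assumption, as is the name `freq_by_index`), might look like this:

```python
import pandas as pd

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

freq_by_index = (
    pd.Series(list_of_days)
    .value_counts()                 # count each demand value (ordered by count by default)
    .sort_index()                   # reorder by demand value, ascending
    .rename_axis("Demand")          # name the index so it becomes the Demand column
    .reset_index(name="Frequency")  # turn the Series into a two-column DataFrame
)
print(freq_by_index)
```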

Another similar solution, sorting as follows:
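This answer's code is also missing from the page. One plausible variant, assuming it sorted the finished table by its Demand column with `sort_values` rather than sorting the index (an assumption, not the answerer's confirmed code), could be sketched as:

```python
import pandas as pd

list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

freq_by_values = (
    pd.Series(list_of_days)
    .value_counts()                 # counts, ordered by frequency
    .rename_axis("Demand")
    .reset_index(name="Frequency")  # columns: Demand, Frequency
    .sort_values("Demand")          # sort the table by demand, ascending
    .reset_index(drop=True)         # renumber the rows 0..n-1 after sorting
)
print(freq_by_values)
```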

I would generate the text of the HTML table you posted:

pd.Series([2,1,0,2,1,3,0,2,4,0,3,2,3,4,2,2,2,4,3,0]).value_counts().to_frame(name='Frequency').rename_axis('Demand', axis=1).sort_index()


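<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>Demand</th>
      <th>Frequency</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>4</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
    </tr>
    <tr>
      <th>2</th>
      <td>7</td>
    </tr>
    <tr>
      <th>3</th>
      <td>4</td>
    </tr>
    <tr>
      <th>4</th>
      <td>3</td>
    </tr>
  </tbody>
</table>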
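If the HTML text itself is wanted, and not only the rendered table, `DataFrame.to_html()` returns it as a string. A minimal sketch, assuming the same one-liner as above (the names `table` and `html_text` are illustrative):

```python
import pandas as pd

demand = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]

table = (
    pd.Series(demand)
    .value_counts()
    .to_frame(name="Frequency")     # single data column named Frequency
    .rename_axis("Demand", axis=1)  # label the column axis so "Demand" shows in the header row
    .sort_index()                   # demand values in ascending order
)

# to_html() returns the table markup as a plain string.
html_text = table.to_html()
print(html_text)
```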

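If you want the shortest version, something like this may do. Note that Counter does not sort its keys (on Python 3.7+ it keeps them in first-seen order), so sorted() is applied to get the demand in ascending order.

from collections import Counter  # needed for Counter below
list_of_days = [2, 1, 0, 2, 1, 3, 0, 2, 4, 0, 3, 2, 3, 4, 2, 2, 2, 4, 3, 0]
day_counter = sorted(Counter(list_of_days).items())  # (demand, frequency) pairs, ascending by demand
data = [[a, b] for a, b in day_counter]
print(data)

[[0, 4], [1, 2], [2, 7], [3, 4], [4, 3]]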

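A better question is: why? If you insist, give it a try. If you really want to shorten it, you can compress some of the line-by-line statements into one line. Are your 7 lines too many?

Why are there so many homework questions today?

Code Review is not the best place to ask pandas questions.

This works efficiently and saves a lot of unnecessary conversions.

Exactly, the conversions are not needed.

It does half the job, but efficiently.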

