Python: how to add values from a given dataset to an empty dictionary


python pandas dataframe dictionary

The code below comes from an exercise in a Python course I am taking on DataCamp, which holds the copyright.

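I was given a CSV file containing Twitter data. I have to iterate over the entries in one column to build a dictionary in which the keys are language names and the values are the number of tweets in each language. The resulting code is correct and works; however, I don't fully understand how the if-else part of the code works.

The output of the code is:
{'en': 97, 'et': 1, 'und': 2}

My question is: how do we arrive at the output above? What exactly happens inside the for loop, and in the if-else?

Thanks, everyone!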
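As requested, I added some explanatory print statements inside the for loop, in both the if and the else branch, to make the code easier to follow.
For ease of explanation, I changed the dataset to a minimal example.
As a general tip: pandas has a built-in method (value_counts) that achieves the same result.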

# Import pandas
import pandas as pd

# Import Twitter data as DataFrame: df
# df = pd.read_csv('tweets.csv')
df = pd.DataFrame(
    data=[
        'en',   # 1st row
        'en',   # 2nd row
        'und',  # 3rd row
        'et',   # 4th row
        'und'   # 5th row
    ],
    columns=['lang']
)

# Initialize an empty dictionary: langs_count
langs_count = {}

# Extract column from DataFrame: col
col = df['lang']

print('before the loop, langs_count is an empty dict')
print(langs_count, '\n')

# Iterate over lang column in DataFrame
for ii, entry in enumerate(col):

    # If the language is already a key in langs_count, add 1 to its value
    if entry in langs_count:
        print(f'{ii}\nif: the key "{entry}" exists, so adds 1 to value')
        langs_count[entry] += 1
    # Else add the language to langs_count, set the value to 1
    else:
        print(f'{ii}\nelse: the key "{entry}" does not exist, so create it with value 1')
        langs_count[entry] = 1

    # Show the state of the dictionary after each row
    print(langs_count, '\n')

# Print the populated dictionary
# print(langs_count)
# {'en': 97, 'et': 1, 'und': 2}

# The same could be reached through value_counts,
# without the need of a loop or if / else
# (as the last expression in a notebook cell or REPL, the result is displayed)
print('value_counts solution')
df['lang'].value_counts().to_dict()
Output:

"""
before the loop, langs_count is an empty dict
{}

0
else: the key "en" does not exist, so create it with value 1
{'en': 1}

1
if: the key "en" exists, so adds 1 to value
{'en': 2}

2
else: the key "und" does not exist, so create it with value 1
{'en': 2, 'und': 1}

3
else: the key "et" does not exist, so create it with value 1
{'en': 2, 'und': 1, 'et': 1}

4
if: the key "und" exists, so adds 1 to value
{'en': 2, 'und': 2, 'et': 1}

value_counts solution
{'en': 2, 'et': 1, 'und': 2}
"""

Thanks for the clear explanation and the extra knowledge you added. It was very helpful!
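For readers who want to see the if/else logic condensed, here is a minimal sketch that is not part of the original exercise. It assumes a plain Python list named langs (a hypothetical stand-in for the 'lang' column) and shows two standard-library alternatives: dict.get with a default of 0, and collections.Counter. Both should give the same counts as the explicit if/else loop.

```python
from collections import Counter

# Hypothetical sample standing in for the 'lang' column of the DataFrame.
langs = ['en', 'en', 'und', 'et', 'und']

# dict.get returns the current count, or 0 when the key is missing,
# so one line covers both the "if" and the "else" branch.
langs_count = {}
for entry in langs:
    langs_count[entry] = langs_count.get(entry, 0) + 1

print(langs_count)

# Counter performs the same bookkeeping internally.
counter_result = dict(Counter(langs))
print(counter_result)
```

The explicit if/else version is arguably easier to read while learning; the dict.get form is the same idea with the "missing key" case folded into the default value.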