Python: querying a crosstab with the IMDB dataset

python pandas

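I am working with the IMDB dataset; its column names are listed at the end.

Questions:
1. A report that captures the trend in the number of letters in movie titles over the years.
2. A crosstab between the year of video release and the quantile that the length falls under. The result should contain: year, min length, max length, number of videos below the 25th percentile, number of videos between the 25th and 50th percentiles, number of videos between the 50th and 75th percentiles, and number of videos above the 75th percentile.

The first part was easy to solve. For the second part, I am a beginner with crosstabs: I know the syntax, but could someone guide me on how to approach this problem?

Please let me know if more information is needed.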

    imdb.columns
    Index(['fn', 'tid', 'title', 'wordsInTitle', 'url', 'imdbRating',
           'ratingCount', 'duration', 'year', 'type', 'nrOfWins',
           'nrOfNominations', 'nrOfPhotos', 'nrOfNewsArticles', 'nrOfUserReviews',
           'nrOfGenre', 'Action', 'Adult', 'Adventure', 'Animation', 'Biography',
           'Comedy', 'Crime', 'Documentary', 'Drama', 'Family', 'Fantasy',
           'FilmNoir', 'GameShow', 'History', 'Horror', 'Music', 'Musical',
           'Mystery', 'News', 'RealityTV', 'Romance', 'SciFi', 'Short', 'Sport',
           'TalkShow', 'Thriller', 'War', 'Western'],
          dtype='object')

This problem can be solved with the following approach:
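
The answer's own code is not preserved on this page. As a stand-in, here is a minimal sketch of one way such a table could be built, not necessarily the method that produced the result shown further down. It assumes that `duration` is the video length, that quartiles are computed over the whole dataset (per-year quartiles would be a different, equally plausible reading), and it uses a small made-up `imdb` frame in place of the real data. The helper name `year_length_crosstab` is hypothetical; the output column names are copied from the result table.

```python
import pandas as pd

# Made-up toy data standing in for the real `imdb` DataFrame (assumption);
# only the `year` and `duration` columns from the question are used.
imdb = pd.DataFrame({
    "year": [2009, 2009, 2009, 2009, 2010, 2010, 2011, 2011, 1904],
    "duration": [3000, 4200, 5400, 7080, 3120, 6840, 5580, 7620, None],
})

# Column names copied verbatim from the result table.
LABELS = [
    "num_videos_less_than25Percentile",
    "num_videos_25_50Percentile",
    "num_videos_50_75Percentile",
    "num_videos_greaterthan75Precentile",
]

def year_length_crosstab(df):
    """Per-year min/max duration plus counts per dataset-wide quartile."""
    # Rows without a year or a duration cannot be binned, so drop them.
    data = df.dropna(subset=["year", "duration"])

    # Assign each video to a quartile of the overall duration distribution.
    # Note: qcut raises ValueError when the quartile edges are not unique.
    quartile = pd.qcut(data["duration"], q=4, labels=LABELS).astype(str)

    # Count videos per (year, quartile); reindex so all four columns
    # exist and appear in a fixed order even if a quartile is empty.
    counts = pd.crosstab(data["year"], quartile)
    counts = counts.reindex(columns=LABELS, fill_value=0)

    # Shortest and longest duration per year.
    extremes = data.groupby("year")["duration"].agg(
        min_length="min", max_length="max"
    )
    return extremes.join(counts)

result = year_length_crosstab(imdb)
print(result)
```

On the real data this would be called as `year_length_crosstab(imdb)`. Because the sketch drops rows with a missing `duration`, a year that has only such rows disappears from the output instead of appearing with `<NA>`, so it will not match the table below row for row.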

Result:

        min_length  max_length  num_videos_less_than25Percentile  num_videos_25_50Percentile  num_videos_50_75Percentile  num_videos_greaterthan75Precentile
year                                                                                                                                                        
1888.0           2           2                                 0                           0                           0                                   1
1894.0          22          22                                 0                           0                           0                                   1
1904.0        <NA>        <NA>                                 0                           0                           0                                   0
1910.0         660         660                                 0                           0                           0                                   1
1911.0        1080        1080                                 0                           0                           0                                   1
...            ...         ...                               ...                         ...                         ...                                 ...
2009.0        3000        7080                                 4                           2                           5                                   4
2010.0        3120        6840                                 1                           1                           1                                   2
2011.0        5580        7620                                 2                           1                           1                                   2
2012.0        1800        6000                                 1                           0                           1                                   1
2014.0        1800        1800                                 0                           0                           0                                   1

[94 rows x 6 columns]
