Python: counting the number of new IDs seen each day with pandas


python pandas


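Given the following data, in which each row records a user being active on a given day, I want to use pandas to count the number of new users per day.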

Data:
  Day | UserID
  ----------
   1  |  A 
   1  |  B
   1  |  C
   1  |  C
   ----------
   2  |  A 
   2  |  B
   2  |  D
   2  |  A 
   2  |  E
  ----------
   3  |  B 
   3  |  D
   3  |  F

Result:    
  Day | New Users
  ---------------
   1  |  3
   2  |  2
   3  |  1

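As I see it, the steps are:

Compute the maximum user ID per day: df.groupby('Day').UserID.max()

Filter the data using the previous day's maximum user ID (initialized to 0): this is the step I don't know how to do with pandas

Count the logins from unique users in the filtered dataset: df.filtered.groupby('Day').UserID.nunique()

Is there a clean way to do this?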
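This builds a table giving the day on which each ID first appears, groups that table by day, and counts the rows in each group.

    import pandas as pd

    df = pd.DataFrame(
        [(1, "A"), (1, "B"), (1, "C"), (1, "C"),
         (2, "A"), (2, "B"), (2, "D"), (2, "A"), (2, "E"),
         (3, "B"), (3, "D"), (3, "F")],
        columns=["day", "userid"],
    )

    # First day per user, then the number of users per first-seen day
    (df
     .sort_values('day')
     .groupby('userid')
     .first()
     .rename(columns={"day": "first_seen"})
     .groupby('first_seen').size()
    )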
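The same idea can also be written with `drop_duplicates`. This is only a sketch of an alternative, not the answerer's method: after sorting by day, the first row kept for each user is that user's earliest appearance, so grouping the remaining rows by day counts first appearances. The names `first_seen` and `new_users` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, "A"), (1, "B"), (1, "C"), (1, "C"),
     (2, "A"), (2, "B"), (2, "D"), (2, "A"), (2, "E"),
     (3, "B"), (3, "D"), (3, "F")],
    columns=["day", "userid"],
)

# Sort by day so that the first row kept per user is the earliest one.
first_seen = df.sort_values("day").drop_duplicates("userid", keep="first")

# Count how many users made their first appearance on each day.
new_users = first_seen.groupby("day").size()
print(new_users)
```

For the sample data this gives 3 new users on day 1, 2 on day 2, and 1 on day 3, matching the expected result in the question.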

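Assuming the DataFrame is already sorted by Day, you can group on UserID (with as_index=False), take the first Day of each group, and then set the index of the result to Day. This gives you every new user for each day.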

    >>> df2 = df.groupby('UserID', as_index=False).Day.first().set_index('Day')
    >>> df2
        UserID
    Day
    1        A
    1        B
    1        C
    2        D
    2        E
    3        F

Then count the new users for each day:

    >>> df2.groupby(level=0).UserID.count()
    Day
    1    3
    2    2
    3    1
    Name: UserID, dtype: int64
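Two edge cases are not covered above: input that is not sorted by Day, and days on which no new user appears (such days drop out of the result entirely). The sketch below is one possible way to handle both, under stated assumptions: it takes the minimum Day per user instead of the first row, and reindexes the counts over every day present in the data. Day 4 is a hypothetical extra day with only returning users, added purely for illustration, and the names `first_day`, `all_days`, and `new_users` are not from the original answers.

```python
import pandas as pd

# The question's data plus one hypothetical day (4) with only returning users.
df = pd.DataFrame(
    {
        "Day": [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4],
        "UserID": ["A", "B", "C", "C", "A", "B", "D", "A", "E",
                   "B", "D", "F", "A", "F"],
    }
)

# Earliest day per user; min() does not depend on the row order of df.
first_day = df.groupby("UserID")["Day"].min()

# Count users per first day, then reindex over every day in the data
# so that days without new users appear with a count of 0.
all_days = sorted(df["Day"].unique())
new_users = (
    first_day.value_counts()
    .reindex(all_days, fill_value=0)
    .rename("New Users")
)
new_users.index.name = "Day"
print(new_users)
```

Days 1 to 3 keep the counts from the question (3, 2, 1), and the hypothetical day 4 shows 0 rather than being omitted.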
