Python: How to split binary data into three columns in a DataFrame

I have been working on this for a while and it does not make sense to me.

I have Twitter data in the following form:

column Lines

"585978391360221184|Thu Apr 09 01:31:50 +0000 2015|Breast cancer risk test devised 
http://bbc.in/1CimpJF"

"585947808772960257|Wed Apr 08 23:30:18 +0000 2015|GP workload harming care - BMA poll 
http://bbc.in/1ChTBRv"

"585947807816650752|Wed Apr 08 23:30:18 +0000 2015|Short people's 'heart risk greater' 
http://bbc.in/1ChTANp"

"585866060991078401|Wed Apr 08 18:05:28 +0000 2015|New approach against HIV 'promising' 
http://bbc.in/1E6jAjt"

"585794106170839041|Wed Apr 08 13:19:33 +0000 2015|Coalition 'undermined NHS' - doctors 
http://bbc.in/1CnLwK7"

'586266687017771008|Thu Apr 09 20:37:25 +0000 2015|Sabra hummus recalled in U.S. 
 http://www.cbc.ca/news/health/sabra-hummus-recalled-in-u-s-1.3026865?cmp=rss'
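
I need to split the data into three columns of the DataFrame using the | character.

I read the data from text files and convert it to a DataFrame. The column is named Lines.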

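import pandas as pd

# all_files is assumed to be a list of file paths built earlier (not shown here)
data = []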

for f in all_files:
    if f =='Health-Tweets.py' or f =='Heath-Tweets.py' :
        continue
    else:
        with open(f, "rb") as myfile:
            data1 = myfile.readlines()
            if not data1:
                continue          
            print(f)            
            data.append(data1)

# flattening the nested list of lines
data2 = [j for sub in data for j in sub]

# transforming the data to dataframe
df = pd.DataFrame(data2)
# renaming the column
df.columns = ['Lines']

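# decode each raw bytes line, falling back to windows-1252 when it is not valid UTF-8
for i in range(df.shape[0]):
    try:
        df.loc[i, 'Lines'] = df.loc[i, 'Lines'].decode('utf-8')
    except UnicodeDecodeError:
        df.loc[i, 'Lines'] = df.loc[i, 'Lines'].decode('windows-1252')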

df[['binary','date','data']]=df['Lines'].str.split('|',expand=True).apply(lambda x: x.str.strip())

I get an error:

ValueError: Columns must be same length as key  

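You may be hitting a case where the split does not give you exactly 3 columns:
'binary', 'date', 'data'
The assignment raises this error when the number of columns produced by str.split(..., expand=True) differs from the number of column names on the left-hand side. That happens when some rows do not contain exactly two | characters: in rows with fewer, the binary or date column has no data after the split, and rows with more produce extra columns.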
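
One way to check this and work around it is sketched below. It is a minimal example under assumptions, not necessarily what the asker's data needs: the sample rows are made up to stand in for the real tweet files, and it assumes that any extra | belongs to the tweet text. It counts the separators per row to find the irregular lines, then splits at most twice with n=2 and pads the result to three columns before assigning.

```python
import pandas as pd

# Hypothetical sample rows standing in for the decoded tweet lines
lines = [
    "585978391360221184|Thu Apr 09 01:31:50 +0000 2015|Breast cancer risk test devised http://bbc.in/1CimpJF",
    "1|Wed Apr 08 23:30:18 +0000 2015|text with an extra | inside",  # three separators
    "a line with no separator at all",                                # no separator
]
df = pd.DataFrame({"Lines": lines})

# Diagnose: a well-formed row contains exactly two "|" characters.
# str.count takes a regular expression, so the pipe has to be escaped.
pipe_counts = df["Lines"].str.count(r"\|")
bad_rows = df[pipe_counts != 2]

# Split at most twice, so any further "|" stays inside the third field.
parts = df["Lines"].str.split("|", n=2, expand=True)

# Strip surrounding whitespace (including the trailing newline from readlines).
parts = parts.apply(lambda col: col.str.strip())

# Pad to exactly three columns in case no row contained two separators.
parts = parts.reindex(columns=range(3))

df[["binary", "date", "data"]] = parts
```

Inspecting bad_rows first shows whether the irregular lines are blank lines, truncated records, or tweets that contain a | themselves, which determines whether they should be dropped or kept with missing values.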