Python: search the first column of a csv and, if len > 30, move the rest to the same row in another file


The title isn't big enough for me to explain this, so here it goes:

I have a csv file that looks like this:

Sample csv contains

long string with some special characters , number, string, number
long string with some special characters , number, string, number
long string with some special characters , number, string, number
long string with some special characters , number, string, number
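I want to go through the first column and, if the length of the string is greater than 20, do the following:

At 20 characters: "long string with som" | "e special characters"

Split the string, modify the first csv so it holds the first part of the string, then create a new csv and put the other part on the same line number; everything else in the new file is just whitespace.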
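
The split described above can be sketched as a small helper. The name `split_fixed` and the default width of 20 are assumptions made for illustration; they do not come from the post.

```python
def split_fixed(text, width=20):
    """Cut text into consecutive pieces of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    # An empty string still yields one (empty) piece, so every row has a first part.
    return [text[i:i + width] for i in range(0, len(text), width)] or [""]


# head would stay in the original csv; overflow would go to the new file(s).
head, *overflow = split_fixed("long string with some special characters")
print(head)
print(overflow)
```

Because the helper returns a list, the same call also covers strings that need more than one extra piece.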


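What I have right now: the code below doesn't do anything yet. I'm just trying to explain the idea to myself and work out how to write the new file using splitString.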

fileName = file name 
maxCollumnLength = number of rows in the whole set 
lineNum = line number of a string that is greater than 20
splitString = second part of the split string that should be written on another file


import csv

def newopenfile(fileName, maxCollumnLength, lineNum, splitString):
    # 'rw' is not a valid mode; open for writing, with newline='' as the csv module expects
    with open(fileName, 'w', encoding="utf8", newline='') as nf:
        writer = csv.writer(nf)  # pass the file object, not the file name
        for i in range(1, maxCollumnLength + 1):
            # write empty rows until reaching lineNum (1-based), then write the overflow of the long string
            writer.writerow([splitString] if i == lineNum else [])
This goes through the first column and checks the length:

import csv

fileName = 'uskrs.csv'
firstColList = []       # stores the first column (for debugging)
splitString = []        # (line number, overflow text) for every long string
with open(fileName, 'r', encoding="utf8", newline='') as rf:   # 'rw' is not a valid mode
    reader = csv.reader(rf, delimiter=',')
    for i, row in enumerate(reader, start=1):   # i is the 1-based line number
        if len(row[0]) > 20:
            # remember the line number and the part beyond 20 characters for newopenfile()
            splitString.append((i, row[0][20:]))
        firstColList.append(row[0])
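From this point on I'm stuck on how to actually change the strings in the csv and how to split them.

The strings can also be 60+ characters long, so they would need to be split more than twice and stored in more than 2 CSVs.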
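
For strings that need more than one extra file, one possible approach is to work out the layout in memory first and write the files afterwards. The sketch below is an assumption about what is wanted, not code from the post: `layout_overflow` is a hypothetical name, and each overflow table holds either a one-cell row or an empty row so that line numbers stay aligned with the original file.

```python
def layout_overflow(rows, width=20, column=0):
    """Return (trimmed_rows, overflow_tables).

    trimmed_rows    -- copy of rows with rows[i][column] cut to `width` characters
    overflow_tables -- one table per extra csv; table k holds, for every row,
                       the (k + 2)-th chunk as a one-cell row, or an empty row
    """
    trimmed, extra_chunks = [], []
    for row in rows:
        text = row[column]
        chunks = [text[i:i + width] for i in range(0, len(text), width)] or [""]
        trimmed.append(row[:column] + [chunks[0]] + row[column + 1:])
        extra_chunks.append(chunks[1:])

    # The longest string decides how many extra csv files are needed.
    n_tables = max((len(c) for c in extra_chunks), default=0)
    overflow_tables = [
        [[c[k]] if k < len(c) else [] for c in extra_chunks]
        for k in range(n_tables)
    ]
    return trimmed, overflow_tables


sample_rows = [["a" * 45, "1"], ["short", "2"]]   # made-up data
trimmed_rows, tables = layout_overflow(sample_rows)
print(len(tables))   # number of extra csv files needed
```

Each table can then be handed to `csv.writer(...).writerows(...)`, one output file per table.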

I'm not good at explaining this problem, so if you have any questions, please ask.

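Okay, I managed to split the first column when its length is greater than 20 and to replace the first column with the first 20 characters: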

import csv

def checkLength(column, readFile, writeFile, maxLen):
    counter = 0
    i = 0
    idxSplitItems = []
    final = []
    newSplits = 0
    with open(readFile,'r', encoding="utf8", newline='') as f:
        reader = csv.reader(f)
        your_list = list(reader)
        final = your_list
        for sublist in your_list:
            #del sublist[-1]    -remove last invisible element
            i+=1
            data = removeUnwanted(sublist[column])
            print(data)
            if len(data) > maxLen:
                counter += 1 # Number of large
                idxSplitItems.append(split_bylen(i,data,maxLen))
                if len(idxSplitItems) > newSplits: newSplits = len(idxSplitItems)
                final[i-1][column] = split_bylen(i,data,maxLen)[1]
                final[i-1][column] = removeUnwanted(final[i-1][column])
            print("After split data: "+ data)
            print("After split final: "+ final[i-1][column])   

    writeSplitToCSV(writeFile, final)
    checkCols(final, 6)
    return final, idxSplitItems
def removeUnwanted(data):
    data = data.replace(',',' ')
    return data

def split_bylen(index, item, maxLen):
    clean = removeUnwanted(item)
    splitList = [clean[ind:ind+maxLen] for ind in range(0, len(item), maxLen)]
    splitList.insert(0,index)
    return splitList

def writeSplitToCSV(writeFile,data):
    with open(writeFile,'w', encoding="utf8", newline='') as f:
        writer = csv.writer(f)
        writer.writerows(data)

def checkCols(data, columns):
    for sublist in data:
        if len(sublist)-1!=columns:
            print("[X] This row doesn't have the same number of columns as the others: " + str(sublist))
        else:
            print("All okay")
#len(data) #how many split items
#print(your_list[0][0])
#print("Number of large: ", counter)

final, idxSplitItems = checkLength(0,'test.csv','final.csv', 30)
print("------------------------")
print(idxSplitItems)
print("-------------------------")
print(final)
Now I have a problem with this part of the code; note these lines:

print("After split data: "+ data)
print("After split final: "+ final[i-1][column]) 
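This is there to check whether removing the commas worked.

For "BUTKOVIĆ VESNA,DIPL.IUR."

data returns

BUTKOVIĆ VESNA DIPL.IUR.

but final returns

BUTKOVIĆ VESNA,DIPL.IUR.

Why does final come back with the "," when it is already gone in data? It must be something done in split_bylen() that causes this.

Dictionaries are fun.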
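
On the comma question above, judging only from the code as posted, a likely cause is that `data` is a cleaned copy, while `final[i-1][column]` is overwritten only inside the `if len(data) > maxLen` branch. A string no longer than `maxLen` would therefore keep its comma in `final`, and `split_bylen()` would never be called for it. The sketch below reproduces that effect with hypothetical helper names and shows one possible fix.

```python
def remove_unwanted(text):
    # Same idea as removeUnwanted() in the post: commas become spaces.
    return text.replace(",", " ")


def trim_as_posted(rows, column, max_len):
    """Mirrors the posted logic: the cleaned value is stored only for long strings."""
    for row in rows:
        data = remove_unwanted(row[column])
        if len(data) > max_len:
            row[column] = data[:max_len]
    return rows


def trim_always_clean(rows, column, max_len):
    """Possible fix: store the cleaned value for every row, then trim it."""
    for row in rows:
        row[column] = remove_unwanted(row[column])[:max_len]
    return rows


sample = "BUTKOVIĆ VESNA,DIPL.IUR."   # 24 characters, shorter than max_len=30
print(trim_as_posted([[sample]], 0, 30))      # the comma survives
print(trim_always_clean([[sample]], 0, 30))   # the comma is replaced by a space
```

In the posted `checkLength`, the equivalent change would be to assign the cleaned `data` to `final[i-1][column]` before the length check.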

To overwrite the original csv, you have to use DictReader and DictWriter. I kept your reading method just for clarity.

import csv

fileName = 'uskrs.csv'
writecsvs = {} # store each line of each new csv
# e.g. {'csv1':[[row0_split1,row0_num,row0_str,row0_num],[row1_split1,row1_num,row1_str,row1_num],...],
#       'csv2':[[row0_split2,row0_num,row0_str,row0_num],[row1_split2,row1_num,row1_str,row1_num],...],
#       .
#       .
#       .}

with open(fileName, mode='r', encoding="utf-8-sig", newline='') as rf:  # 'rw' is not a valid mode
    reader = csv.reader(rf, delimiter=',')
    for row in reader:
        col1 = row[0]
        # check size & split
        # decide number of new csvs
        # store new content in writecsvs dict
        # (the original csv can only be overwritten after this block has closed it)

# Loop over each csv in writecsvs
for name, writelines in writecsvs.items():  # writelines: the list of rows for that csv
    # use the keys in writecsvs for filenames
    with open(name + '.csv', mode='w', encoding="utf-8", newline='') as out_file:
        csv.writer(out_file).writerows(writelines)
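
One way the outline above could be filled in is sketched here. It is an illustration under assumptions, not the answerer's code: `split_first_column`, the `width` parameter and the `csvN.csv` file names are hypothetical, and the function writes `csv1.csv` instead of overwriting the input so that the original file is left untouched.

```python
import csv
import os
import tempfile


def split_first_column(src_path, out_dir, width=20):
    """Write csv1.csv, csv2.csv, ... into out_dir and return their paths.

    csv1 holds every row with its first column cut to `width` characters;
    csvN (N >= 2) holds the N-th chunk on the same line number, else a blank line.
    """
    with open(src_path, mode="r", encoding="utf-8-sig", newline="") as rf:
        rows = list(csv.reader(rf))

    writecsvs = {"csv1": []}
    for line_no, row in enumerate(rows):
        col1 = row[0] if row else ""
        chunks = [col1[i:i + width] for i in range(0, len(col1), width)] or [""]
        writecsvs["csv1"].append([chunks[0]] + row[1:])
        for n, chunk in enumerate(chunks[1:], start=2):
            lines = writecsvs.setdefault("csv%d" % n, [])
            lines.extend([] for _ in range(line_no - len(lines)))  # blank lines up to this row
            lines.append([chunk])

    paths = []
    for name, lines in writecsvs.items():
        lines.extend([] for _ in range(len(rows) - len(lines)))  # pad to the full height
        path = os.path.join(out_dir, name + ".csv")
        with open(path, mode="w", encoding="utf-8", newline="") as out_file:
            csv.writer(out_file).writerows(lines)
        paths.append(path)
    return paths


# Demo in a throwaway directory with made-up data.
demo_dir = tempfile.mkdtemp()
demo_src = os.path.join(demo_dir, "in.csv")
with open(demo_src, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows([["a" * 45, "1"], ["short", "2"], ["b" * 25, "3"]])
for written_path in split_first_column(demo_src, demo_dir):
    print(os.path.basename(written_path))
```

Keeping the rows in a dict keyed by output name means the number of files does not have to be known in advance; a new key appears the first time a string needs that many chunks.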

Hope this helps.

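I've worked on it and found a way to solve it. What I've done:

1. I can search the first column of the csv for strings longer than 30 characters.
2. I can split a long string every 30 characters and store the pieces in a list.
3. I can write a new csv with the correct information but with the shorter strings.

The problem is: it doesn't remove the "," inside the string (not the "," that the csv itself needs), but I created a function that removes those commas, if there are any.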
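
As a side note on the comma problem: a comma inside a field does not have to be removed for the file to stay valid. With its default settings, `csv.writer` quotes such a field and `csv.reader` returns it intact. A minimal round trip through an in-memory buffer, using a made-up sample row:

```python
import csv
import io

row = ["BUTKOVIĆ VESNA, DIPL.IUR.", "12", "text", "7"]   # made-up sample row

buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)     # default quoting is csv.QUOTE_MINIMAL
written = buffer.getvalue()

# Reading the text back yields the original four fields, comma included.
read_back = next(csv.reader(io.StringIO(written, newline="")))
print(written)
print(read_back == row)
```

Removing the commas is then only needed if whatever consumes the file cannot handle quoted fields.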