Best way to access the Nth line of a CSV file in Python

python file csv python-3.x

I have to access the Nth line in a CSV file.

Here is what I did:

import csv

the_file = open('path', 'r')
reader = csv.reader(the_file)

N = int(input('What line do you need? > '))  # input() returns a str in Python 3
i = 0

for row in reader:
    if i == N:
        print("This is the line.")
        print(row)
        break

    i += 1

the_file.close()

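...but this does not feel optimal. Edit for precision: if the file is huge, I do not want to go through all the lines, and I do not want to load the whole file into memory.

I was hoping something like reader[N] exists, but I have not found it.

Edit for answer: this line (from the chosen answer) is what I was looking for:

next(itertools.islice(csv.reader(f), N, None))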

You can reduce your for loop to a comprehension expression, e.g.

row = [row for i,row in enumerate(reader) if i == N][0]  

# or even nicer as seen in iCodez code with next and generator expression

row = next(row for i,row in enumerate(reader) if i == N)
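The two forms above return the same row but differ in how much of the file they read: the list comprehension consumes the whole reader before `[0]` is applied, whereas `next()` on a generator expression stops at the first match. A minimal sketch of that difference, using an in-memory `io.StringIO` in place of a real file; the sample data and the `counting` wrapper are made up for illustration:

```python
import csv
import io

SAMPLE = "a,1\nb,2\nc,3\nd,4\n"  # hypothetical data standing in for a real file
N = 1


def counting(lines, seen):
    """Yield lines unchanged while recording each one that gets consumed."""
    for line in lines:
        seen.append(line)
        yield line


# List comprehension: every row is read before [0] is applied.
seen_all = []
reader = csv.reader(counting(io.StringIO(SAMPLE), seen_all))
row_a = [row for i, row in enumerate(reader) if i == N][0]

# Generator expression + next(): iteration stops at the first match.
seen_some = []
reader = csv.reader(counting(io.StringIO(SAMPLE), seen_some))
row_b = next(row for i, row in enumerate(reader) if i == N)

print(row_a, len(seen_all))
print(row_b, len(seen_some))
```

Both variants still have to read rows 0 to N; the saving is only in the rows after N.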

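Your solution is actually not that bad. Advancing the file iterator to the line you want is a good approach and is used in many situations like this.

However, if you want it to be more concise, you can use next and enumerate with a generator expression. The None passed as the second argument to next is what is returned if the line is not found (N is too large). You can pick any other value, though.

You can also open the file with a with statement so that it is closed automatically: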

import csv

with open('path', 'r') as the_file:
    reader = csv.reader(the_file)

    N = int(input('What line do you need? > '))

    line = next((x for i, x in enumerate(reader) if i == N), None)
    print(line)

If you really want to cut down on size, you could do:

from csv import reader
N = int(input('What line do you need? > '))
with open('path') as f:
    print(next((x for i, x in enumerate(reader(f)) if i == N), None))
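If the lookup is needed in more than one place, the same expression can be wrapped in a small function so that the "row not found" case is explicit at the call site. This is only a sketch: the name `nth_row`, its `default` parameter, and the temporary demo file are assumptions made for illustration and are not part of the answer above.

```python
import csv
import os
import tempfile


def nth_row(path, n, default=None):
    """Return row n (0-based) of the CSV file at `path`, or `default` if absent."""
    with open(path, newline='') as f:
        return next((row for i, row in enumerate(csv.reader(f)) if i == n), default)


# Demo on a throwaway file with made-up contents.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write("id,value\n100,0.5\n101,0.19\n")
    demo_path = tmp.name

found = nth_row(demo_path, 2)      # third row of the file
missing = nth_row(demo_path, 100)  # past the end, so the default is returned
os.remove(demo_path)

print(found, missing)
```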

It makes little difference, but it is slightly cleaner to use enumerate rather than maintaining your own counter variable:

for i, row in enumerate(reader):
    if i == N:
        print("This is the line.")
        print(row)
        break

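You can also use itertools.islice, which is designed for this type of scenario: accessing a particular slice of an iterable without reading the whole thing into memory. It still has to consume the rows before N, but it should be somewhat more efficient than looping over the unwanted rows yourself.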

import csv
import itertools

with open('path', 'r') as f:
    N = int(input('What line do you need? > '))
    print("This is the line.")
    print(next(itertools.islice(csv.reader(f), N, None)))
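Note that `next()` without a default raises `StopIteration` when N is past the end of the file. A minimal sketch of the same `islice` approach with a default value, again using `io.StringIO` in place of a real file; the helper name `row_at` and the sample data are assumptions made for illustration:

```python
import csv
import io
import itertools

SAMPLE = "a,1\nb,2\nc,3\n"  # hypothetical data standing in for a real file


def row_at(lines, n):
    """Return row n (0-based) from an iterable of CSV lines, or None if absent."""
    reader = csv.reader(lines)
    # islice skips the first n rows lazily; the second argument to next()
    # is returned instead of raising StopIteration when the slice is empty.
    return next(itertools.islice(reader, n, None), None)


last = row_at(io.StringIO(SAMPLE), 2)
beyond = row_at(io.StringIO(SAMPLE), 10)
print(last, beyond)
```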

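But if your CSV file is small, simply read the entire thing into a list, which you can then index in the normal way. This also has the advantage that you can access several different rows in any order without having to reset the csv reader.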

my_csv_data = list(reader)
print(my_csv_data[N])

You can simply do:

n = 2  # line to print (1-based)
with open('foo.csv', 'r') as fd:
    lines = fd.readlines()  # reads the whole file into memory
print(lines[n-1])  # prints the 2nd line

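Or use linecache, which saves you from writing the loop yourself (note that it reads and caches the whole file internally, so it does not actually reduce memory use):

import linecache
n = 2
print(linecache.getline('foo.csv', n))  # line numbers start at 1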
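One caveat with both `readlines()` and `linecache` is that they return raw physical lines, not parsed CSV rows, so the two numberings drift apart once a quoted field contains a newline. A minimal sketch of that mismatch on a throwaway file; the file contents are made up for illustration:

```python
import csv
import linecache
import os
import tempfile

# Made-up file: the second record has a quoted field spanning two physical lines.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write('id,note\n1,"first\nsecond"\n2,plain\n')
    demo_path = tmp.name

raw_line = linecache.getline(demo_path, 3)  # third physical line (1-based), unparsed

with open(demo_path, newline='') as f:
    rows = list(csv.reader(f))              # parsed records

linecache.clearcache()
os.remove(demo_path)

print(repr(raw_line))  # a fragment of the second record
print(rows[2])         # the third record
```

Here the third physical line is the tail of the second record, while the third CSV record starts on the fourth physical line.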

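The itertools module has a number of functions for creating specialized iterators, and its islice() function can be used to solve this problem easily: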

import csv
import itertools

N = 5  # desired line number

with open('path.csv', newline='') as the_file:
    row = next(csv.reader(itertools.islice(the_file, N, N+1)))

print("This is the line.")
print(row)

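On the other hand, out of curiosity, my initial reaction, which also works (and is arguably better), was to pass the csv.reader itself to islice instead of the file object.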
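The two orderings are only equivalent when every CSV record occupies exactly one physical line. A minimal sketch comparing them on made-up data with a multi-line quoted field; `io.StringIO` stands in for the open file, and the names `by_file` and `by_reader` are assumptions for illustration:

```python
import csv
import io
import itertools

# Made-up data: the second record spans two physical lines.
SAMPLE = 'id,note\n1,"first\nsecond"\n2,plain\n'
N = 2  # 0-based index of the wanted record

# Slice the file first: picks physical line N, then parses only that line.
by_file = next(csv.reader(itertools.islice(io.StringIO(SAMPLE), N, N + 1)))

# Slice the reader: parses records from the start and picks record N.
by_reader = next(itertools.islice(csv.reader(io.StringIO(SAMPLE)), N, N + 1))

print(by_file)
print(by_reader)
```

Slicing the file avoids parsing the skipped lines, which is where its speed advantage comes from, but only slicing the reader returns the intended record here.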

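Nice use of next! Never seen that before :-) +1

Is "best" meant in terms of code compactness? You could do lines = [line for line in reader] and then lines[N]. Note that, like some of the other answers, this requires reading the whole file.

This question appears to be off-topic because it is about optimizing working code, and it is best suited to codereview.stackexchange.com.

As Ollie said, no matter how the code looks, you start at position 0 of the document and move to position x. It is not like an array, where some arithmetic lets you jump straight to the right place.

@OllieFord lines = list(reader) is more idiomatic.

@Veedrac Nice. list(reader)[N] is probably the best choice if code compactness is the goal.

@OllieFord: Thanks for the observation. linecache can be used as an alternative.

I did not know about linecache; it seems like a nice solution!

What about multi-line fields in CSV? As far as I understand, the document on the CSV format allows line breaks inside fields as long as they are within double quotes (section 2, paragraph 6), so if there is any chance of that, it is better to use csv.reader.

If you look at the source code of the linecache module, you will see that it also uses file.readlines() to read the whole file into memory, so it does not use less memory. It also carries extra overhead, because it is designed for reading multiple lines from [other] imported modules rather than a single line from a data file.

If you are sure the CSV maps source lines to CSV rows 1:1, slice the file (islice(infile…)) before passing it to csv.reader.

@Veedrac Apart from possibly getting wrong data later, you gain nothing from that... Personally, I would stick with the original version.

@JonClements I meant it as an option for when you need faster access; it is not guaranteed to be correct, so I would not use it by default either.

@JonClements For N=16…4096 in factors of two, the speed advantage of my approach is [2,3,5,7,10,12,12,15,16].

@Veedrac I would have expected some difference, but certainly not one that large! Thanks very much for taking the time to come back with some numbers.

To get the line numbers right with enumerate(), you might want to add the keyword argument start=1 to the call.

@martineau That may be so, but I kept it that way to match the OP's code, which starts from 0.

Python's csv reader has a line_num attribute, so you do not have to use enumerate if you do not want to. Something like: if reader.line_num == N: dosomething

@Veedrac Thanks. I used a dict because d[k] has O(1) complexity, but l[i] has the same O(1) complexity.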
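The two numbering suggestions in the comments above, `enumerate(..., start=1)` and `reader.line_num`, agree only when each record occupies one physical line, because `line_num` counts lines read from the source rather than records returned. A minimal sketch on made-up data with a multi-line quoted field:

```python
import csv
import io

# Made-up data: the second record spans two physical lines.
SAMPLE = 'id,note\n1,"first\nsecond"\n2,plain\n'

reader = csv.reader(io.StringIO(SAMPLE))
# Pair the 1-based record index with the reader's physical-line counter,
# captured right after each record has been produced.
pairs = [(i, reader.line_num) for i, _row in enumerate(reader, start=1)]

print(pairs)
```

So `reader.line_num == N` selects by physical line in the file, while `enumerate` selects by record.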

import csv

with open('cvs_file.csv', 'r') as inFile:
    reader = csv.reader(inFile)
    my_content = list(reader)  # read every row into memory

# input() returns a str, so convert it before comparing with len()
line_no = int(input('What line do you need (line numbers begin at 0)? > '))
if 0 <= line_no < len(my_content):
    print(my_content[line_no])
else:
    print('This line does not exist')

Sample run:

What line do you need (line numbers begin at 0)? > 2
['101', '0.19', '1']

What line do you need (line numbers begin at 0)? > 100
This line does not exist

Variant of the islice example above that slices the csv.reader instead of the file object; this line replaces the one inside the with block:

    row = next(itertools.islice(csv.reader(the_file), N, N+1))