Using HFS+ on OSX with Python: how do I get the correct case of an existing filename?


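I am storing data about files that exist on an OSX HFS+ filesystem. Later I want to iterate over the stored data and determine whether each file still exists. For my purposes I care about the case of the filename, so if a filename's case has changed I consider the file to no longer exist. I started by trying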

os.path.isfile(filename)

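but on a standard OSX install on HFS+, this returns True even when the case of the filename does not match. I am looking for a way to write an isfile() function that cares about case even when the filesystem does not.

os.path.normcase() and os.path.realpath() both return the filename in whatever case I pass in.

Edit:

I now have two functions that seem to work for filenames limited to ASCII. I don't know how Unicode or other characters would affect this.

The first is based on the answers given by omz and Alex L.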

def does_file_exist_case_sensitive1a(fname):
    if not os.path.isfile(fname): return False
    path, filename = os.path.split(fname)
    search_path = '.' if path == '' else path
    for name in os.listdir(search_path):
        if name == filename : return True
    return False

The second is probably even less efficient.

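import glob, os, re

def does_file_exist_case_sensitive2(fname):
    if not os.path.isfile(fname): return False
    # Locate the last ASCII letter in the path.
    m = re.search(r'[a-zA-Z][^a-zA-Z]*\Z', fname)
    if m:
        # Replace exactly that letter with '?' so glob has to match
        # against the real directory listing, which is case-sensitive.
        test = fname[:m.start()] + '?' + fname[m.start() + 1:]
        actual = glob.glob(test)
        # The wildcard may match several names; the file exists with
        # this case only if fname itself is among them.
        return fname in actual
    else:
        return True  # no letters in file, case sensitivity doesn't matter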

Here is a third, based on DSM's answer.

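def does_file_exist_case_sensitive3(fname):
    if not os.path.isfile(fname): return False
    path, filename = os.path.split(fname)
    search_path = '.' if path == '' else path
    # Map inode number -> name as listed in the directory. The names from
    # os.listdir() must be joined with search_path before calling os.stat().
    inodes = {os.stat(os.path.join(search_path, x)).st_ino: x
              for x in os.listdir(search_path)}
    return inodes[os.stat(fname).st_ino] == filename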

I don't expect any of these to perform well if I have thousands of files in one directory. I am still hoping to find something more efficient.
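
One possible way to reduce the cost when many stored names live in the same directory is to list each directory once and keep the names in a set. The sketch below is written for Python 3 and is only an illustration: the names `_names_in` and `isfile_exact_case_cached` are hypothetical, it still checks only the last path component, and it assumes the directory does not change while the cache is in use.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def _names_in(directory):
    """List a directory once and remember the exact spellings it contains."""
    return frozenset(os.listdir(directory))


def isfile_exact_case_cached(fname):
    """Return True if fname is a file whose last component is spelled exactly as given.

    Assumption: the directory is not modified between calls; otherwise the
    cached listing is stale and _names_in.cache_clear() must be called.
    """
    directory, name = os.path.split(fname)
    directory = directory or os.curdir
    try:
        # Set membership is an exact, case-sensitive string comparison.
        return name in _names_in(directory) and os.path.isfile(fname)
    except OSError:
        # The directory itself is missing or unreadable.
        return False
```

With this approach each directory is read once, so checking thousands of stored names costs one listing plus one set lookup per name rather than one listing per name.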

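Another drawback I noticed while testing these is that they only check the case of the filename itself. If I pass them a path that includes directory names, none of the functions so far check the case of those directory names.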
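
A minimal sketch of how the listing-based check could be extended to every path component, not just the last one. It is written for Python 3, the name `exists_with_exact_case` is hypothetical, and it makes no attempt to handle Unicode normalization differences (for example NFD names returned by HFS+).

```python
import os


def exists_with_exact_case(path):
    """Return True if path exists and every component is spelled exactly as on disk."""
    if not os.path.lexists(path):
        return False
    path = os.path.normpath(path)
    while True:
        parent, name = os.path.split(path)
        if not name:
            # Reached the filesystem root (or a drive root): nothing left to check.
            return True
        if name not in (os.curdir, os.pardir):
            # os.listdir() reports names as stored, so membership is case-sensitive.
            if name not in os.listdir(parent or os.curdir):
                return False
        if not parent:
            # Relative path fully consumed.
            return True
        path = parent
```

Each component costs one directory listing, so this is slower than checking only the filename; it trades speed for checking the directory names as well.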

You can use something like os.listdir and check whether the list contains the filename you are looking for.

Expanding on omz's post, something like this might work:

import os

def getcase(filepath):
    path, filename = os.path.split(filepath)
    for fname in os.listdir(path):
        if filename.lower() == fname.lower():
            return os.path.join(path, fname)

print getcase('/usr/myfile.txt')

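I have a crazy idea. Disclaimer: I don't know nearly enough about filesystems to consider the edge cases, so take this merely as something that happened to work. Once.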

>>> !ls
A.txt   b.txt
>>> inodes = {os.stat(x).st_ino: x for x in os.listdir(".")}
>>> inodes
{80827580: 'A.txt', 80827581: 'b.txt'}
>>> inodes[os.stat("A.txt").st_ino]
'A.txt'
>>> inodes[os.stat("a.txt").st_ino]
'A.txt'
>>> inodes[os.stat("B.txt").st_ino]
'b.txt'
>>> inodes[os.stat("b.txt").st_ino]
'b.txt'
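
The same inode idea can be sketched as a small Python 3 function that stops at the first match instead of building a dictionary for the whole directory. The name `name_on_disk` is hypothetical, and the sketch assumes that no two entries in the directory share an inode (hard links would make the result ambiguous).

```python
import os


def name_on_disk(path):
    """Return the spelling of path's last component as listed in its directory, or None.

    Assumption: no hard links in the directory, so an inode identifies one name.
    """
    try:
        target_inode = os.lstat(path).st_ino
    except OSError:
        return None  # path does not exist in any case
    parent = os.path.dirname(path) or os.curdir
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.inode() == target_inode:
                return entry.name
    return None
```

Comparing the returned name with `os.path.basename(path)` then tells whether the case matches.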

You could also try to open the file:

    try:open('test', 'r')
    except IOError: print 'File does not exist'

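This answer is only a proof of concept, because it does not attempt to escape special characters, handle non-ASCII characters, or deal with filesystem encoding issues.

On the plus side, the answer does not involve looping over files in Python, and it correctly checks the directory names leading up to the final path segment.

This suggestion is based on the observation that (at least when using bash) the following command finds the path /my/path without error if and only if /my/path has that exact casing:

$ ls /[m]y/[p]ath

(If the brackets are left out of any path part, that part is not sensitive to changes in casing.)

Here is a sample function based on this idea: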

import os.path
import subprocess

def does_exist(path):
    """Return whether the given path exists with the given casing.

    The given path should begin with a slash and not end with a trailing
    slash.  This function does not attempt to escape special characters
    and does not attempt to handle non-ASCII characters, file system
    encodings, etc.
    """
    parts = []
    while True:
        head, tail = os.path.split(path)
        if tail:
            parts.append(tail)
            path = head
        else:
            assert head == '/'
            break
    parts.reverse()
    # For example, for path "/my/path", pattern is "/[m]y/[p]ath".
    pattern = "/" + "/".join(["[%s]%s" % (p[0], p[1:]) for p in parts])
    cmd = "ls %s" % pattern
    return_code = subprocess.call(cmd, shell=True)
    return not return_code

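This answer complements the existing answers by providing adapted functions that:

- also work with non-ASCII characters
- handle all path components (not just the last one)
- work with Python 2.x and 3.x
- as a bonus, also work on Windows (there are better Windows-specific solutions, but the functions here are cross-platform and require no additional packages)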

import os
import sys
import unicodedata

def gettruecasepath(path):  # IMPORTANT: path must be a Unicode string
    if not os.path.lexists(path):  # use lexists to also find broken symlinks
        raise OSError(2, u'No such file or directory', path)
    isosx = sys.platform == u'darwin'
    if isosx:  # convert to NFD for comparison with os.listdir() results
        path = unicodedata.normalize('NFD', path)
    parentpath, leaf = os.path.split(path)
    # find the true case of the leaf component
    if leaf not in [u'.', u'..']:  # skip . and .. components
        leaf_lower = leaf.lower()  # if you use Py3.3+: change .lower() to .casefold()
        found = False
        for leaf in os.listdir(u'.' if parentpath == u'' else parentpath):
            if leaf_lower == leaf.lower():  # see .casefold() comment above
                found = True
                if isosx:
                    leaf = unicodedata.normalize('NFC', leaf)  # convert to NFC for return value
                break
        if not found:
            # should only happen if the path was just deleted
            raise OSError(2, u'Unexpectedly not found in ' + parentpath, leaf_lower)
    # recurse on the parent path
    if parentpath not in [u'', u'.', u'..', u'/', u'\\'] and \
            not (sys.platform == u'win32' and
                 os.path.splitdrive(parentpath)[1] in [u'\\', u'/']):
        parentpath = gettruecasepath(parentpath)  # recurse
    return os.path.join(parentpath, leaf)

def istruecasepath(path):  # IMPORTANT: path must be a Unicode string
    return gettruecasepath(path) == unicodedata.normalize('NFC', path)
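
The normalize() calls above matter because Python compares strings code point by code point, so the composed (NFC) and decomposed (NFD) spellings of the same name are not equal. A minimal Python 3 sketch of the effect follows; the helper `same_filename` is hypothetical and only illustrates the comparison.

```python
import unicodedata

# "Motörhead" with a precomposed o-umlaut (NFC): one code point for the umlaut.
name_nfc = unicodedata.normalize("NFC", "Mot\u00f6rhead")
# The same name decomposed (NFD), as HFS+ stores it: "o" plus a combining diaeresis.
name_nfd = unicodedata.normalize("NFD", name_nfc)


def same_filename(a, b):
    """Compare two names after bringing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(name_nfc == name_nfd)               # False
print(len(name_nfc), len(name_nfd))       # 9 10
print(same_filename(name_nfc, name_nfd))  # True
```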

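`gettruecasepath()` gets the case-exact representation of the specified path (absolute or relative) as stored in the filesystem, if it exists:
- The input path must be a Unicode string:
  - Python 3.x: strings are natively Unicode, so no extra action is needed
  - Python 2.x: literals: prefix with `u`, e.g., `u'Motörhead'`; str variables: convert with, e.g., `strVar.decode('utf8')`
- The returned string is a Unicode string in NFC (composed normal form). NFC is returned even on OSX, where the filesystem (HFS+) stores names in NFD (decomposed normal form).
  NFC is returned because it is far more common than NFD, and Python does not recognize equivalent NFC and NFD strings as (conceptually) identical. See below for background information.
- The returned path retains the structure of the input path (relative vs. absolute, components such as `.` and `..`), except that multiple path separators are collapsed and, on Windows, the returned path always uses `\` as the path separator
- On Windows, a drive / UNC-share component, if present, is retained as-is
- An `OSError` exception is raised if the path does not exist or if you do not have permission to access it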
- If you use this function on a ca