Java: sort order is inconsistent across Linux, OSX and Android depending on locale
I am trying to get the same sort order on Android, Linux and OSX. I compare the output of the sort command on Linux and OSX with the output of custom code on Android that runs over a similar set of files. On Linux/OSX I use the following command:
find {folder_name} -type f | sort
In Java/Android I use the following, but the sort order does not match:
private Enumeration<InputStream> getSortedStreams(HashMap<String, InputStream> collection) {
    Vector<InputStream> fileStreams = new Vector<>();
    List<String> keys = new ArrayList<>(collection.keySet());
    Collator collator = Collator.getInstance(Locale.US); // <<< ???
    Collections.sort(keys, collator);
    for (String key : keys) {
        Log.d(TAG, "getSortedStreams: " + key);
        fileStreams.add(collection.get(key));
    }
    return fileStreams.elements();
}
OSX输出:
1000/abc-d.txt
1000/abc_d.txt
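I assume the difference comes from the different locales used to sort the file list. As far as I know, both OSX and Linux are POSIX compliant, although Linux is not certified. Android is not POSIX compliant either, but I assumed it would behave reasonably as far as sorting goes.
I have put some details below to explain the problem clearly and to get a consistent cross-platform experience.
It looks like I can get Linux and Android to align, but OSX ignores the environment variables I set.
I need specific help with setting the locale so that I get consistent results across platforms.
I have not tested on iOS yet; I can add that if needed.
More details: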
Test case:
Create two files with the following names in a directory called sort_test:
sort_test/abc_d.txt
sort_test/abc-d.txt
On Fedora Linux Core 17 (3.9.10-100.fc17.x86_64)
locale -a: the locales available for en_US are:
locale -a | grep en_US
en_US
en_US.iso88591
en_US.iso885915
en_US.utf8
Using C:
find sort_test/ -type f | env -i LC_COLLATE=C sort
sort_test/abc-d.txt
sort_test/abc_d.txt
Using en_US.utf8:
find sort_test/ -type f | env -i LC_COLLATE=en_US.utf8 sort
sort_test/abc_d.txt
sort_test/abc-d.txt
On OSX the behavior is confusing; setting the locale seems to have no effect:
locale -a gives a list of locales; the en_US locales include:
en_US
en_US.ISO8859-1
en_US.ISO8859-15
en_US.US-ASCII
en_US.UTF-8
Using C:
find sort_test -type f | env -i LC_COLLATE=C sort
sort_test/abc-d.txt
sort_test/abc_d.txt
Using en_US.UTF-8:
find sort_test -type f | env -i LC_COLLATE=en_US.UTF-8 sort
sort_test/abc-d.txt
sort_test/abc_d.txt
On Android, I set the collator to use the POSIX locale variant:
Locale locale = new Locale("en", "US", "POSIX"); // <<< the fix
Collator collator = Collator.getInstance(locale);
Collections.sort(keys, collator);
for (String key : keys) {
    Log.d(TAG, "getSortedStreams: " + key);
    fileStreams.add(collection.get(key));
}
Output:
/1000/abc-d.txt
/1000/abc_d.txt
Without the fix, using Locale.US:
//Locale locale = new Locale("en", "US", "POSIX");
Collator collator = Collator.getInstance(Locale.US);
Collections.sort(keys, collator);
for (String key : keys) {
    Log.d(TAG, "getSortedStreams: " + key);
    fileStreams.add(collection.get(key));
}
Output:
/1000/abc_d.txt
/1000/abc-d.txt
The Linux locale variables are (output of the locale command):
LANG=en_US.UTF-8
LC_CTYPE=UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
The OSX locale variables are (output of the locale command):
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
The solution that currently seems to work for me is to align all the operating systems with OSX.
Linux:
find sort_test -type f | env -i LC_COLLATE=C sort
OSX:
find sort_test -type f | env -i LC_COLLATE=C sort
Android:
Locale locale = new Locale("en", "US", "POSIX"); // <<< the fix
Collator collator = Collator.getInstance(locale);
Collections.sort(keys, collator);
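For comparison, here is a minimal sketch of a different approach that does not depend on any Collator or locale variant. It is not from the question: the class name ByteOrderSort and the method sortLikeCLocale are hypothetical. The assumption is that the reference order is the one sort produces under LC_COLLATE=C, which compares raw bytes, so the comparator below compares the UTF-8 encoding of each name byte by byte, treating bytes as unsigned.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ByteOrderSort {
    // Hypothetical helper: orders strings by their UTF-8 bytes (unsigned),
    // which is assumed to mirror `sort` under LC_COLLATE=C.
    static final Comparator<String> UTF8_BYTE_ORDER = (a, b) -> {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Mask to 0..255 so bytes >= 0x80 sort after ASCII
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // Equal prefix: the shorter string sorts first
        return x.length - y.length;
    };

    // Returns a sorted copy and leaves the input list untouched
    static List<String> sortLikeCLocale(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(UTF8_BYTE_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("1000/abc_d.txt", "1000/abc-d.txt");
        for (String name : sortLikeCLocale(names)) {
            System.out.println(name);
        }
    }
}
```

Because '-' (0x2D) has a lower byte value than '_' (0x5F), abc-d.txt comes before abc_d.txt here, the same order as the LC_COLLATE=C runs shown earlier. Whether this matches every file name you have still needs to be checked on each platform.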
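Comments:
They should all be sorted alphabetically. Can you show the output that each of them produces? Also, maybe you should sort the file streams before returning them, rather than the keys. There is no reason to sort the keys, because you just get the values from the HashMap in the original order.
I have updated the question with some output. It looks like Android and Bash do not handle this the same way.
As I said in my second comment, I don't think you are really sorting anything before adding to the Vector. HashMap order is not guaranteed, so whatever order the keys are in, what you really care about is the values.
I read the code as follows: I pass in a set of strings, i.e. the file names, together with the actual file contents. I then sort the collection by file name, retrieve the corresponding file contents, and add them in sorted order to the initially empty fileStreams collection.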
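The reading in the last comment can be checked with a small sketch. It is an illustration only, not the asker's code: the class name SortedValues and the method valuesInKeyOrder are hypothetical, and it uses a TreeMap with natural String ordering instead of a Collator. The point it shows is that looking values up by sorted key yields the values in key order, whatever order the HashMap iterates in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortedValues {
    // Hypothetical helper: returns the map's values ordered by their keys.
    static <V> List<V> valuesInKeyOrder(Map<String, V> collection) {
        // TreeMap iterates in ascending key order (natural String order here)
        return new ArrayList<>(new TreeMap<>(collection).values());
    }

    public static void main(String[] args) {
        Map<String, String> files = new HashMap<>();
        files.put("1000/abc_d.txt", "contents of abc_d");
        files.put("1000/abc-d.txt", "contents of abc-d");
        for (String contents : valuesInKeyOrder(files)) {
            System.out.println(contents);
        }
    }
}
```

To sort with a specific Collator instead, one could pass it to the TreeMap constructor that takes a Comparator and then call putAll.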