C: How to sort an array of strings alphabetically (case-sensitive, non-standard collation)
I need C code to sort some strings. The sort should be case-sensitive, and for the same letter in upper and lower case, the lowercase one must come first. For example, sorting the following strings:
eggs
bacon
cheese
Milk
spinach
potatoes
milk
spaghetti
should give:
bacon
cheese
eggs
milk
Milk
potatoes
spaghetti
spinach
I have written some code, but the result I get is:
Milk
bacon
cheese
eggs
milk
potatoes
spaghetti
spinach
I don't know how to improve this, and I have searched a lot. Can anyone help me?
#include <stdio.h>
#include <string.h>

int main(){
    char c;
    char name[20][10], temp[10];
    int count_name = 0;
    int name_index = 0;
    int i, j;

    while ((c = getchar()) != EOF){
        if (c == 10){
            name[count_name][name_index] = '\0';
            count_name++;
            name_index = 0;
        } else {
            name[count_name][name_index] = c;
            name_index++;
        }
    }

    for(i=0; i < count_name-1 ; i++){
        for(j=i+1; j< count_name; j++)
        {
            if(strcmp(name[i],name[j]) > 0)
            {
                strcpy(temp,name[i]);
                strcpy(name[i],name[j]);
                strcpy(name[j],temp);
            }
        }
    }

    for (i = 0; i < count_name; i++){
        printf("%s\n", name[i]);
    }
}
You can write a custom comparison function for the sort.
First, look at the default sort order:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

const char *tgt[]={
    "bacon", "Bacon", "mIlk", "Milk", "spinach", "MILK", "milk", "eggs"
};
int tgt_size=8;

static int cmp(const void *p1, const void *p2){
    return strcmp(* (char * const *) p1, * (char * const *) p2);
}

int main(int argc, char *argv[]) {
    printf("Before sort:\n\t");
    for(int n=0; n<tgt_size; n++)
        printf("%s ", tgt[n]);

    qsort(tgt, tgt_size, sizeof(char *), cmp);

    printf("\nAfter sort:\n\t");
    for(int n=0; n<tgt_size; n++)
        printf("%s ", tgt[n]);

    return 0;
}
We can write our own comparison function that ignores case, to be called from cmp, which is in turn called by qsort. It looks like this:
int mycmp(const char *a, const char *b) {
    const char *cp1 = a, *cp2 = b;

    for (; toupper(*cp1) == toupper(*cp2); cp1++, cp2++)
        if (*cp1 == '\0')
            return 0;
    return ((toupper(*cp1) < toupper(*cp2)) ? -1 : +1);
}
The case-ignoring version now prints:
Before sort:
bacon Bacon mIlk Milk spinach MILK milk eggs
After sort:
bacon Bacon eggs Milk MILK milk mIlk spinach
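This is the same output you would get from the POSIX case-insensitive comparison function, strcasecmp.
The function mycmp first compares lexicographically in the normal order [a|A]-[z|Z]. This means words with the same letters end up together, but you are as likely to get bacon, Bacon as Bacon, bacon. That is because qsort is not a stable sort, and "Bacon" compares equal to "bacon".
What we want now is this: if the comparison is 0 when ignoring case (that is, the same word, such as "MILK" and "milk"), compare again including case and reverse the order: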
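int mycmp(const char *a, const char *b) {
    const char *cp1 = a, *cp2 = b;
    int sccmp=1;

    /* first pass: compare ignoring case */
    for (; toupper(*cp1) == toupper(*cp2); cp1++, cp2++)
        if (*cp1 == '\0') {
            sccmp = 0;
            break; /* stop at the terminator; do not read past the end */
        }
    if (sccmp) return ((toupper(*cp1) < toupper(*cp2)) ? -1 : +1);

    /* second pass: same letters, so compare with case and reverse the order */
    for (; *a == *b; a++, b++)
        if (*a == '\0')
            return 0;
    return ((*a < *b) ? +1 : -1);
}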
Unfortunately, this approach becomes unwieldy for Unicode. For complex sorts, consider using a mapping or a multi-step sort with a stable sort.
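One way to read "multi-step sort" is two stable passes: first order the words by the tiebreaker (case), then re-sort them by the main key (letters only), relying on stability to keep the tiebreaker order among equal words. The following is only a sketch under that reading, assuming ASCII input. The names stable_sort, by_case_reversed, by_letters_only and sort_lower_first are hypothetical, and insertion sort is used only because it is short and stable.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: insertion sort, which is stable
 * (items that compare equal keep their relative order). */
static void stable_sort(const char **v, size_t n,
                        int (*cmp)(const char *, const char *))
{
    for (size_t i = 1; i < n; i++) {
        const char *key = v[i];
        size_t j = i;
        while (j > 0 && cmp(v[j - 1], key) > 0) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}

/* Pass 1 key: plain strcmp, reversed, so "milk" lands before "Milk"
 * (assumes ASCII, where lowercase codes are larger than uppercase). */
static int by_case_reversed(const char *a, const char *b)
{
    return strcmp(b, a);
}

/* Pass 2 key: letters only, ignoring case. */
static int by_letters_only(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = toupper((unsigned char)*a);
        int cb = toupper((unsigned char)*b);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == '\0')
            return 0;
    }
}

/* Two stable passes: least significant key first, most significant last. */
void sort_lower_first(const char **v, size_t n)
{
    stable_sort(v, n, by_case_reversed);
    stable_sort(v, n, by_letters_only);
}
```

Called on {"Milk", "milk", "bacon", "MILK"}, this leaves the array as bacon, milk, Milk, MILK. A single comparator with a built-in tiebreak does the same job in one pass; the two-pass form is mainly useful when the keys are computed separately.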
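For complex and locale-aware alphabetical collation, consider Unicode collation. For example, letters are alphabetized differently in different locales: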
Swedish                    z < ö
                           y == w
German                     ö < z
Danish                     Z < Å
Lithuanian                 i < y < k
Tr German                  ä == æ
Tr Spanish                 c < ch < d
German Dictionary Sort:    of < öf
German Phonebook Sort:     öf < of
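The defaults for these distinctions are captured in the Default Unicode Collation Element Table (DUCET), which provides a default mapping for Unicode collation and string comparison. You can modify the defaults to capture the distinction between dictionary sorting and phonebook sorting, different locales, or different treatment of case. Variations between individual locales are actively tracked in the Unicode Common Locale Data Repository (CLDR).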
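Within standard C, the nearest hook to locale-specific collation is strcoll(), whose behaviour is selected with setlocale(LC_COLLATE, ...). The sketch below is an illustration only: the names cmp_locale and sort_with_locale are hypothetical, which locale names exist is system-dependent, and how a given locale orders upper and lower case is up to the platform's collation data. Only the "C" locale is guaranteed to exist, and in that locale strcoll() orders strings like strcmp().

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two strings with the current LC_COLLATE rules. */
static int cmp_locale(const void *p1, const void *p2)
{
    const char *a = *(const char *const *)p1;
    const char *b = *(const char *const *)p2;
    return strcoll(a, b);
}

/*
 * Sort v under the named collation locale.
 * Returns 0 on success, or -1 if the locale is not available
 * (in which case v is left untouched).
 * Assumption: locale_name is whatever your system calls the locale.
 */
int sort_with_locale(const char **v, size_t n, const char *locale_name)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;
    qsort(v, n, sizeof v[0], cmp_locale);
    return 0;
}
```

Note that setlocale() changes process-wide state, so a real program would normally set the locale once at startup rather than on every sort.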
The recommendation for multi-level sorting is tiered:
Level   Description       Examples
L1      Base characters   role < roles < rule
L2      Accents           role < rôle < roles
L3      Case/Variants     role < Role < rôle
L4      Punctuation       role < “role” < Role
Ln      Identical         role < ro□le < “role”
A widely used implementation of Unicode collation is in the ICU library. The default DUCET collation for a few examples is:
b < B < bad < Bad < bäd
c < C < cote < coté < côte < côté
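You can explore the ICU library and try out its collation for different locales.
If you want to implement your own version of the DUCET for giggles, you can follow the general method used by an existing implementation. It is not overwhelming, but it is not trivial either.

The key to the OP's code is its use of the function strcmp() to compare two strings. So I will start by replacing this standard function with another one, like the following: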
// We assume that the collating sequence satisfies the following rules:
// 'A' < 'B' < 'C' < ...
// 'a' < 'b' < 'c' < ...
// We don't make any other assumptions.
#include <ctype.h>

int my_strcmp(const char * s1, const char * s2)
{
    const char *p1 = s1, *p2 = s2;
    while(*p1 == *p2 && *p1 != '\0' && *p2 != '\0')
        p1++, p2++; /* keep searching... */
    if (*p1 == *p2)
        return 0;
    if (*p1 == '\0')
        return -1;
    if (*p2 == '\0')
        return +1;
    /* cast to unsigned char: the <ctype.h> functions require it */
    int c1 = tolower((unsigned char)*p1), c2 = tolower((unsigned char)*p2);
    int u1 = isupper((unsigned char)*p1) != 0, u2 = isupper((unsigned char)*p2) != 0;
    if (c1 != c2)
        return c1 - c2; // <<--- Alphabetical order assumption is used here
    return u1 - u2; // same letter: lowercase (u == 0) comes first
}
Now, by replacing strcmp() with my_strcmp(), you will get the desired result.
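In a sorting algorithm it is a good idea to think separately about its 3 main aspects:
- The comparison function.
- The abstract sorting algorithm that we will use.
- The way in which data will be "moved" in the array when two items have to be swapped.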
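So, for example, once the comparison function is settled, the next optimization step could be to replace the double-for sorting algorithm with a more efficient one, such as quicksort.
In particular, the function qsort() of the standard library <stdlib.h> provides you with such an algorithm, so you do not need to program it yourself. Finally, the strategy you use to store the array data can affect performance.
It is more efficient to store the strings as an "array of pointers to char" than as an "array of arrays of char", since swapping pointers is faster than swapping two entire arrays of chars.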
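To make the three aspects concrete, here is a minimal sketch that stores the words as an array of pointers and lets qsort() do the moving. It is one possible arrangement, not the code from this answer: cmp_lower_first and sort_words are hypothetical names, the comparator ignores case first and uses "lowercase first" only to break ties, and ASCII is assumed.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparator: ignore case first; on a tie, lowercase sorts first. */
static int cmp_lower_first(const void *p1, const void *p2)
{
    const char *a = *(const char *const *)p1;
    const char *b = *(const char *const *)p2;
    size_t i;

    /* If one string ends first, its '\0' differs from the other's character,
     * so the loop returns before reading past either terminator. */
    for (i = 0; a[i] != '\0' || b[i] != '\0'; i++) {
        int la = tolower((unsigned char)a[i]);
        int lb = tolower((unsigned char)b[i]);
        if (la != lb)
            return la < lb ? -1 : 1;
    }
    /* Same letters: reversed strcmp puts 'm' before 'M' (ASCII assumption). */
    return strcmp(b, a);
}

/* qsort() only moves the pointers; the characters themselves stay in place. */
void sort_words(const char **words, size_t n)
{
    qsort(words, n, sizeof words[0], cmp_lower_first);
}
```

With the eight words from the question, sort_words() produces bacon, cheese, eggs, milk, Milk, potatoes, spaghetti, spinach. Each swap moves one pointer instead of copying a 10-byte array three times with strcpy().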
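Additional note: the first three if() statements are actually redundant, because the logic of the statements that follow already gives the desired result when *p1 or *p2 is 0. However, keeping those if() statements makes the code more readable.

Here, if I got it right, you want something I would describe as follows:
A case-insensitive sort where, on a tie, the tiebreaker condition "lowercase comes first" is used.
So it is like this: the letter that comes earlier in the alphabet sorts first, ignoring case; only when that does not separate the two does lowercase sort before uppercase.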
/* The final comparison in my_strcmp() can also be written as a single return: */
return (c1 != c2) ? c1 - c2 : u1 - u2;
#include <ctype.h> // for tolower and islower

int my_character_compare(const char a, const char b)
{
    int my_result;

    my_result = tolower(a) - tolower(b);
    // unless it is zero, my_result is definitely the result here
    // Note: if any one of them was 0, result will also properly favour that one

    if (my_result == 0 && a != b)
    // if (could not be distinguished with #1, but are different)
    {
        // means that they are case-insensitively same
        // but different...
        // means that one of them are lowercase, the other one is upper
        if (islower(a))
            return -1; // favour a
        else
            return 1; // favour b
    }

    // regardless if zero or not, my_result is definitely just the result
    return my_result;
}

int my_string_compare(const char * a, const char * b)
{
    int my_result;

    my_result = my_character_compare(*a, *b);
    // unless it is zero, my_result is definitely the result here

    while (my_result == 0 && *a != 0)
    // current characters deemed to be same
    // if they are not both just 0 we will have to check the next ones
    {
        my_result = my_character_compare(*++a, *++b);
    }

    // whatever the my_result has been:
    // whether it became != zero on the way and broke out of the loop
    // or it is still zero, but we have also reached the end of the road/strings
    return my_result;
}
/* A more compact version of the same comparison */
#include <ctype.h>

int my_string_compare(const char * a, const char * b)
{
    int my_result;

    while (*a || *b)
    {
        if ((my_result = tolower(*a) - tolower(*b)))
            return my_result;
        if (*a != *b)
            return (islower(*a)) ? -1 : 1;
        a++;
        b++;
    }

    return 0;
}
Keeping things together:    Simple "M after m":
------------------------    -------------------
mars                        mars
mars bar                    mars bar
Mars bar                    milk
milk                        milk-duds
Milk                        milky-way
milk-duds                   Mars bar
milky-way                   Milk
Milky-way                   Milky-way
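The two columns above can be reproduced with two small comparators, which may help in deciding which behaviour is wanted. This is a sketch written for illustration, not the code from the answer: char_rank, cmp_simple, cmp_together, sort_simple and sort_together are hypothetical names, and the code assumes the "C" locale and an ASCII-like character set in which 'a'..'z' are contiguous.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Rank of one character: non-letters keep their code and sort below letters;
 * letters are ordered a < A < b < B < ... (contiguous 'a'..'z' assumed). */
static int char_rank(unsigned char c)
{
    if (!isalpha(c))
        return c;
    return 256 + 2 * (tolower(c) - 'a') + (isupper(c) ? 1 : 0);
}

/* Simple "M after m": the first differing character decides, by rank. */
static int cmp_simple(const void *p1, const void *p2)
{
    const char *a = *(const char *const *)p1;
    const char *b = *(const char *const *)p2;

    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return char_rank((unsigned char)*a) - char_rank((unsigned char)*b);
}

/* "Keeping things together": ignore case over the whole word;
 * case is used only to break a tie between otherwise equal words. */
static int cmp_together(const void *p1, const void *p2)
{
    const char *a = *(const char *const *)p1;
    const char *b = *(const char *const *)p2;
    size_t i = 0;

    while (a[i] != '\0' &&
           tolower((unsigned char)a[i]) == tolower((unsigned char)b[i]))
        i++;
    if (a[i] != '\0' || b[i] != '\0')
        return tolower((unsigned char)a[i]) - tolower((unsigned char)b[i]);
    return strcmp(b, a); /* equal ignoring case: lowercase first */
}

void sort_simple(const char **v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_simple);
}

void sort_together(const char **v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_together);
}
```

Sorting the eight words of the table with sort_together() gives the left column, and sort_simple() gives the right column. The difference is whether case is looked at per character (right) or only after the whole word has tied (left).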
#include <ctype.h>
#include <string.h>

#ifdef I_DONT_HAVE_STRCASECMP
/* fallback for systems without the POSIX strcasecmp() */
int strcasecmp (const char *a, const char *b) {
    while (*a && *b) {
        if (tolower(*a) != tolower(*b)) {
            break;
        }
        ++a;
        ++b;
    }
    return tolower(*a) - tolower(*b);
}
#else
#include <strings.h> /* POSIX strcasecmp() */
#endif

int alphaBetize (const char *a, const char *b) {
    int r = strcasecmp(a, b);
    if (r) return r;
    /* if equal ignoring case, use opposite of strcmp() result to get
     * lower before upper */
    return -strcmp(a, b); /* aka: return strcmp(b, a); */
}
/* Variant that keeps non-letters below letters and remembers the first case difference */
int alphaBetize (const char *a, const char *b) {
    int weight = 0;
    /* while, not do/while: two empty strings must not be read past their end */
    while (*a && *b) {
        if (*a != *b) {
            if (!(isalpha(*a) && isalpha(*b))) {
                if (isalpha(*a) || isalpha(*b)) {
                    return isalpha(*a) - isalpha(*b);
                }
                return *a - *b;
            }
            if (tolower(*a) != tolower(*b)) {
                return tolower(*a) - tolower(*b);
            }
            /* treat as equal, but mark the weight if not set */
            if (weight == 0) {
                weight = isupper(*a) - isupper(*b);
            }
        }
        ++a;
        ++b;
    }
    /* if the words compared equal, use the weight as tie breaker */
    if (*a == *b) {
        return weight;
    }
    return !*b - !*a;
}
#include <limits.h> /* CHAR_MAX */
#include <string.h> /* strchr */

const char * alphaBetical =
    "aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ";

int alphaBeta_lookup (int c) {
    static int initialized;
    static char table[CHAR_MAX+1];
    if (!initialized) {
        /* leave all non-alphaBeticals in their relative order, but below
           alphaBeticals */
        int i, j;
        for (i = j = 1; i < CHAR_MAX+1; ++i) {
            if (strchr(alphaBetical, i)) continue;
            table[i] = j++;
        }
        /* now run through the alphaBeticals */
        for (i = 0; alphaBetical[i]; ++i) {
            table[(int)alphaBetical[i]] = j++;
        }
        initialized = 1;
    }
    /* return the computed ordinal of the provided character */
    if (c < 0 || c > CHAR_MAX) return c;
    return table[c];
}
/* alphaBetize() rewritten on top of alphaBeta_lookup() */
int alphaBetize (const char *a, const char *b) {
    int ax = alphaBeta_lookup(*a);
    int bx = alphaBeta_lookup(*b);
    int weight = 0;
    /* while, not do/while: two empty strings must not be read past their end */
    while (ax && bx) {
        char al = tolower(*a);
        char bl = tolower(*b);
        if (ax != bx) {
            if (al != bl) {
                return alphaBeta_lookup(al) - alphaBeta_lookup(bl);
            }
            if (weight == 0) {
                weight = ax - bx;
            }
        }
        ax = alphaBeta_lookup(*++a);
        bx = alphaBeta_lookup(*++b);
    }
    /* if the words compared equal, use the weight as tie breaker */
    return (ax != bx) ? !bx - !ax : weight;
}
/* Simple "M after m" ordering: the first differing character decides */
int simple_collating (const char *a, const char *b) {
    while (alphaBeta_lookup(*a) == alphaBeta_lookup(*b)) {
        if (*a == '\0') break;
        ++a, ++b;
    }
    return alphaBeta_lookup(*a) - alphaBeta_lookup(*b);
}
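#include <locale.h> /* setlocale */
#include <string.h> /* strcoll */

/*
 * To change the collating locale, use (for example):
       setlocale(LC_COLLATE, "en_US.UTF-8");
 * Locale names are system-dependent; check which ones are installed.
 */
int iso_collating (const char *a, const char *b) {
    return strcoll(a, b);
}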
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//Global Variables Required
//=========
const char *tgt[]={"bacon", "Bacon", "mIlk",
    "Milk", "spinach", "MILK", "milk", "eggs"}; //Array for sorting
int tgt_size=8; //Total Number of Records
int SortLookupTable[128]; //Custom sorting table
typedef int cmp_t (const void *, const void *);

//Prototypes for the functions defined after main()
void CreateSortTable(void);
void myQsort(void *a, unsigned int n, unsigned int es, cmp_t *cmp);
int compare(const void *a, const void *b);

int main(int argc, char *argv[]) {
    printf("Before sort:\n\n");
    int n=0;
    for(n=0; n<tgt_size; n++)
        printf("%s\n", tgt[n]);

    CreateSortTable();
    myQsort(tgt, tgt_size, sizeof(char *), &compare);

    printf("\n\n====\n\n");
    for(n=0; n<tgt_size; n++)
        printf("%s\n", tgt[n]);
    return 0;
}

void CreateSortTable(void){
    //Desired order: lowercase before uppercase for each letter
    const char *s = "aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ";
    int i;

    for (i = 0; i < 128; i++){
        SortLookupTable[i] = 0;
    }
    //A string literal is enough here; no malloc() needed
    for (i = 1; *s; s++, i++){
        SortLookupTable[(int) ((unsigned char) *s)]=i;
    }
}
//Some important definitions required by Quick Sort
//=======
#define SWAPINIT(a, es) swaptype = ((char *)a - (char *)0) % sizeof(long) || \
    es % sizeof(long) ? 2 : es == sizeof(long)? 0 : 1;

#define swap(a, b) \
    if (swaptype == 0) { \
        long t = *(long *)(a); \
        *(long *)(a) = *(long *)(b); \
        *(long *)(b) = t; \
    } else \
        swapfunc(a, b, es, swaptype)

#define vecswap(a, b, n) if ((n) > 0) swapfunc(a, b, n, swaptype)

#define swapcode(TYPE, parmi, parmj, n) { \
    long i = (n) / sizeof (TYPE); \
    register TYPE *pi = (TYPE *) (parmi); \
    register TYPE *pj = (TYPE *) (parmj); \
    do { \
        register TYPE t = *pi; \
        *pi++ = *pj; \
        *pj++ = t; \
    } while (--i > 0); \
}

#define min(a, b) ((a) < (b) ? (a) : (b))

//Other important function
void swapfunc(char *a, char *b, int n, int swaptype){
    if(swaptype <= 1)
        swapcode(long, a, b, n)
    else
        swapcode(char, a, b, n)
}
//Median of three elements, used to pick the pivot
char * med3(char *a, char *b, char *c, cmp_t *cmp){
    if ( cmp(a, b) < 0){
        if (cmp(b, c) < 0){
            return b;
        }else{
            if ( cmp(a, c) < 0){
                return c;
            }else{
                return a;
            }
        }
    }else{
        //here a >= b, so b is the median only when b > c
        if (cmp(b, c) > 0){
            return b;
        }else{
            if ( cmp(a, c) < 0){
                return a;
            }else{
                return c;
            }
        }
    }
}
//Custom Quick Sort
void myQsort(void *a, unsigned int n, unsigned int es, cmp_t *cmp){
    char *pa, *pb, *pc, *pd, *pl, *pm, *pn;
    int d, r, swaptype, swap_cnt;

loop: SWAPINIT(a, es);
    swap_cnt = 0;
    if (n < 7) {
        for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es)
            for (pl = pm; pl > (char *)a && cmp(pl - es, pl) > 0; pl -= es){
                swap(pl, pl - es);
            }
        return;
    }
    pm = (char *)a + (n / 2) * es;
    if (n > 7) {
        pl = a;
        pn = (char *)a + (n - 1) * es;
        if (n > 40) {
            d = (n / 8) * es;
            pl = med3(pl, pl + d, pl + 2 * d, cmp);
            pm = med3(pm - d, pm, pm + d, cmp);
            pn = med3(pn - 2 * d, pn - d, pn, cmp);
        }
        pm = med3(pl, pm, pn, cmp);
    }
    swap(a, pm);
    pa = pb = (char *)a + es;

    pc = pd = (char *)a + (n - 1) * es;
    for (;;) {
        while (pb <= pc && (r = cmp(pb, a)) <= 0) {
            if (r == 0) {
                swap_cnt = 1;
                swap(pa, pb);
                pa += es;
            }
            pb += es;
        }
        while (pb <= pc && (r = cmp(pc, a)) >= 0) {
            if (r == 0) {
                swap_cnt = 1;
                swap(pc, pd);
                pd -= es;
            }
            pc -= es;
        }
        if (pb > pc)
            break;
        swap(pb, pc);
        swap_cnt = 1;
        pb += es;
        pc -= es;
    }
    if (swap_cnt == 0) { /* Switch to insertion sort */
        for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es)
            for (pl = pm; pl > (char *)a && cmp(pl - es, pl) > 0;
                 pl -= es)
                swap(pl, pl - es);
        return;
    }

    pn = (char *)a + n * es;
    r = min(pa - (char *)a, pb - pa);
    vecswap(a, pb - r, r);
    r = min(pd - pc, pn - pd - es);
    vecswap(pb, pn - r, r);
    if ((r = pb - pa) > es)
        myQsort(a, r / es, es, cmp);
    if ((r = pd - pc) > es) {
        /* Iterate rather than recurse to save stack space */
        a = pn - r;
        n = r / es;
        goto loop;
    }
}
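//Map a character to its position in the custom table (0 if not listed)
unsigned char Change(char a){
    unsigned char u = (unsigned char) a;
    return u < 128 ? (unsigned char ) SortLookupTable[u] : 0;
}

int compare (const void *a, const void *b){
    const char *s1= *(const char * const *)a;
    const char *s2= *(const char * const *)b;
    size_t i;

    //Skip the common prefix; the terminator stops the loop
    for(i=0; s1[i] != '\0' && s1[i] == s2[i]; i++)
        ;
    if (s1[i] == s2[i]){
        return 0; //identical strings
    }
    //A comparator must return <0, 0 or >0; returning 0 for "less"
    //would make the sort treat the two strings as equal
    if ( Change(s1[i]) < Change(s2[i]) ){
        return -1;
    }
    return 1;
}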
#include <stdio.h>
#include <limits.h>

/*
 * Initialize an index array associated with the collating
 * sequence in co. The affected array can subsequently be
 * passed in as the final client data pointer into qsort_r
 * to be used by collating_compare below.
 */
int
collating_init(const char *co, int *cv, size_t n)
{
    const unsigned char *uco = (const unsigned char *) co;
    const unsigned char *s;
    size_t i;

    if (n <= UCHAR_MAX) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        /* default for chars not named in the sequence */
        cv[i] = UCHAR_MAX;
    }
    for (s = uco; *s; s++) {
        /*
         * the "collating value" for a character's own
         * character code is its ordinal (starting from
         * zero) in the collating sequence. I.e., we
         * compare the values of cv['a'] and cv['A'] -
         * rather than 'a' and 'A' - to determine order.
         */
        cv[*s] = (s - uco);
    }

    return 0;
}

static int
_collating_compare(const char *str1, const char *str2, int *ip)
{
    const unsigned char *s1 = (const unsigned char *) str1;
    const unsigned char *s2 = (const unsigned char *) str2;

    while (*s1 != '\0' && *s2 != '\0') {
        int cv1 = ip[*s1++];
        int cv2 = ip[*s2++];

        if (cv1 < cv2) return -1;
        if (cv1 > cv2) return 1;
    }

    if (*s1 == '\0' && *s2 == '\0') {
        return 0;
    } else {
        return *s1 == '\0' ? -1 : 1;
    }
}

int
collating_compare(const void *v1, const void *v2, void *p)
{
    return _collating_compare(*(const char **) v1,
                              *(const char **) v2,
                              (int *) p);
}
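qsort_r() is not part of ISO C, and the glibc and BSD versions take their arguments in different orders. Where it is not available, one common workaround is to hand the table to the comparator through a file-scope pointer and call plain qsort(). The following is a sketch of that idea only: active_table, table_compare, build_table and sort_with_table are hypothetical names, and the approach is neither reentrant nor thread-safe.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical: table consulted by table_compare(); set just before qsort(). */
static const int *active_table;

static int table_compare(const void *v1, const void *v2)
{
    const char *s1 = *(const char *const *)v1;
    const char *s2 = *(const char *const *)v2;

    for (; *s1 != '\0' && *s2 != '\0'; s1++, s2++) {
        int c1 = active_table[(unsigned char)*s1];
        int c2 = active_table[(unsigned char)*s2];
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }
    /* One string is a prefix of the other: the shorter one sorts first. */
    return (*s1 != '\0') - (*s2 != '\0');
}

/* Fill cv[0..UCHAR_MAX] from the collating sequence co.
 * Characters not named in co all get UCHAR_MAX and so sort last. */
void build_table(const char *co, int cv[UCHAR_MAX + 1])
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        cv[i] = UCHAR_MAX;
    for (i = 0; co[i] != '\0'; i++)
        cv[(unsigned char)co[i]] = i;
}

/* Sort v with the standard qsort(), using cv as the collating table. */
void sort_with_table(const char **v, size_t n, const int *cv)
{
    active_table = cv;
    qsort(v, n, sizeof v[0], table_compare);
    active_table = NULL;
}
```

The trade-off is the usual one for global state: it works with any conforming C library, but two sorts with different tables cannot run at the same time.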
gcc -DMAIN_TEST -Wall -o custom_collate_sort custom_collate_sort.c
#if defined(MAIN_TEST)

/* qsort_r is a GNU-ey thing... */

#define __USE_GNU
#include <stdlib.h>
#include <string.h>
#include <strings.h> /* strcasecmp */

#define NELEM(x) (sizeof x / sizeof 0[x])

static int
cmp(const void *v1, const void *v2)
{
    return strcmp(*(const char **) v1, *(const char **) v2);
}

static int
casecmp(const void *v1, const void *v2)
{
    return strcasecmp(*(const char **) v1, *(const char **) v2);
}

int
main(int ac, char *av[])
{
    size_t i;

    int cval[256], ret;
    int cval_rev[256], rret;

    char *tosort[] = {
        "cheeSE", "eggs", "Milk", "potatoes", "cheese", "spaghetti",
        "eggs", "milk", "spinach", "bacon", "egg", "apple", "PEAR",
        "pea", "berry"
    };

    /* both sequences must name every letter, including w and W */
    ret = collating_init("aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ",
                         cval, NELEM(cval));
    rret = collating_init("ZzYyXxWwVvUuTtSsRrQqPpOoNnMmLlKkJjIiHhGgFfEeDdCcBbAa",
                          cval_rev, NELEM(cval_rev));

    if (ret == -1 || rret == -1) {
        fputs("collating value array must accommodate an index of UCHAR_MAX\n", stderr);
        return 1;
    }

    puts("Unsorted:");
    for (i = 0; i < NELEM(tosort); i++) {
        printf(" %s\n", tosort[i]);
    }

    qsort((void *) tosort, NELEM(tosort), sizeof tosort[0], cmp);

    puts("Sorted w/ strcmp:");
    for (i = 0; i < NELEM(tosort); i++) {
        printf(" %s\n", tosort[i]);
    }

    qsort((void *) tosort, NELEM(tosort), sizeof tosort[0], casecmp);

    puts("Sorted w/ strcasecmp:");
    for (i = 0; i < NELEM(tosort); i++) {
        printf(" %s\n", tosort[i]);
    }

    qsort_r((void *) tosort, NELEM(tosort), sizeof tosort[0],
            collating_compare, (void *) cval);

    puts("Sorted w/ collating sequence:");
    for (i = 0; i < NELEM(tosort); i++) {
        printf(" %s\n", tosort[i]);
    }

    qsort_r((void *) tosort, NELEM(tosort), sizeof tosort[0],
            collating_compare, (void *) cval_rev);

    puts("Sorted w/ reversed collating sequence:");
    for (i = 0; i < NELEM(tosort); i++) {
        printf(" %s\n", tosort[i]);
    }

    return 0;
}

#endif /* MAIN_TEST */